Selahattin Gungormus, Developer in Istanbul, Turkey

Selahattin Gungormus

Verified Expert in Engineering

Data Engineer and Developer

Location
Istanbul, Turkey
Toptal Member Since
May 4, 2021

Selahattin is a data engineer with years of hands-on experience building scalable data integration solutions with open-source technologies. He specializes in developing data applications on distributed processing platforms such as Hadoop, Spark, and Kafka. Selahattin also has hands-on experience with cloud architectures such as AWS and Azure, as well as developing microservices using Python and JavaScript frameworks.

Portfolio

Afiniti
Apache Spark, Python, Redis, Greenplum, Kubernetes, TypeScript, SQL...
Iyzico
Apache Airflow, Spark, Spark Streaming, Python, ETL Development...
Majestech
Apache Spark, Python, Apache Airflow, Node.js, Hadoop, SQL, Data Modeling...

Experience

Availability

Part-time

Preferred Environment

Apache Airflow, Visual Studio Code (VS Code), Apache Spark, Amazon Web Services (AWS), Azure, Jupyter Notebook

The most amazing...

...thing I've built is a product that leverages Apache Spark for data processing and can be operated through a drag-and-drop visual interface.

Work Experience

Lead Data and Back-end Engineer

2019 - PRESENT
Afiniti
  • Built a highly scalable, containerized data integration platform using Spark, Docker/Kubernetes, Python, and Greenplum database.
  • Packaged the whole data pipeline procedure into an easy-to-deploy templating system capable of running at scale with good performance, making the data pipeline process 70% faster.
  • Created data models and pipelines for the application, resulting in powering dashboard reports with over 10 million events.
  • Established and standardized CI/CD pipeline processes across the team using Jenkins, Bitbucket, and Kubernetes.
  • Built and maintained an app's back-end service using Node.js, JavaScript, and GraphQL.
Technologies: Apache Spark, Python, Redis, Greenplum, Kubernetes, TypeScript, SQL, Data Modeling, Database Design, Apache Kafka, Data Pipelines, Data Engineering
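
The templating approach described above could be sketched as follows. This is a minimal, hypothetical illustration: the field names, image, and endpoints are invented, not the actual Afiniti platform, which ran Spark jobs on Kubernetes against a Greenplum target.

```python
from string import Template

# Hypothetical sketch: each pipeline is described by a small config dict and
# rendered into a deployable spark-submit command from one shared template.
# All names below (endpoint, image, paths) are illustrative placeholders.
PIPELINE_TEMPLATE = Template(
    "spark-submit --master k8s://$k8s_master "
    "--deploy-mode cluster "
    "--conf spark.kubernetes.container.image=$image "
    "$entrypoint --source $source --target $target"
)

def render_pipeline(config: dict) -> str:
    """Render a deployable submit command for one pipeline from the template."""
    return PIPELINE_TEMPLATE.substitute(config)

cmd = render_pipeline({
    "k8s_master": "https://k8s.example.internal:6443",  # assumed endpoint
    "image": "registry.example/etl:1.0",                # assumed image
    "entrypoint": "local:///app/jobs/orders.py",
    "source": "kafka.orders",
    "target": "greenplum.analytics.orders",
})
```

Keeping pipeline definitions as data rather than code is what makes such a system fast to deploy: adding a pipeline means adding a config entry, not writing a new job.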

Senior Data Engineer

2019 - 2019
Iyzico
  • Redesigned and optimized the existing data pipeline process by creating a new technology stack with Airflow, Python, Spark, and the Exasol database.
  • Migrated over 300 data pipeline jobs from Talend to the new data platform, improving daily ETL performance by 60% (from eight hours to three).
  • Created real-time data feeds from transactional systems to dashboards using Spark Streaming and Kafka. This new capability improved the operational efficiency of performance monitoring during peak hours.
  • Delivered daily data marts to the AWS Redshift service through AWS integration, powering daily reports for the global board.
Technologies: Apache Airflow, Spark, Spark Streaming, Python, ETL Development, Amazon Web Services (AWS), Data Engineering

Owner | Big Data Engineer | Instructor

2015 - 2019
Majestech
  • Provided consulting and training services to transform SMEs' data architectures using cloud-based alternatives such as Amazon Web Services and Azure.
  • Delivered over ten data integration projects for businesses in the retail, banking, and telecommunications sectors. Transformed data integration processes to utilize cloud platforms such as AWS and Azure.
  • Built a clickstream data application that collects the web footprints of application users and stores them in a data lake with minimal latency. Used Kafka and Spark Streaming on AWS as the technology base.
  • Launched a cloud-based data integration product: Integer8 on the AWS platform.
  • Built a visual interface for non-developer data professionals who want to leverage the distributed processing power of Hadoop and Spark.
  • Partnered with Cloudera to deliver big data engineering training (more than 20 sessions).
  • Created data integration pipelines on Snowflake Cloud DB on AWS using Apache Airflow and S3 connectors.
Technologies: Apache Spark, Python, Apache Airflow, Node.js, Hadoop, SQL, Data Modeling, Apache Kafka, Amazon EC2, Amazon Web Services (AWS), Data Engineering

Data Engineer

2012 - 2015
i2i Systems
  • Automated data quality testing with Python, using Oracle metadata to generate daily automated tasks that assess potential issues in the daily pipelines.
  • Created daily integration pipelines feeding the enterprise data warehouse on the ODS and RDS layers.
  • Built the data preparation layer for a marketing optimization project for a telecom operator. Data from more than 35 million subscribers was collected from five different source systems into denormalized data structures with Oracle Data Integrator.
Technologies: Oracle, PL/SQL, Data Warehouse Design, Oracle Data Integrator 11g, Python, Data Pipelines, Data Engineering
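
Metadata-driven quality testing like the above can be sketched in a few lines. This is a hypothetical illustration: the metadata rows below stand in for what Oracle dictionary views (such as ALL_TAB_COLUMNS) would return; the real job queried Oracle metadata directly.

```python
# Invented sample metadata; in production these rows would come from
# Oracle's data dictionary, not a hard-coded list.
METADATA = [
    {"table": "CUSTOMERS", "column": "CUSTOMER_ID", "nullable": "N"},
    {"table": "CUSTOMERS", "column": "EMAIL", "nullable": "Y"},
    {"table": "ORDERS", "column": "ORDER_DATE", "nullable": "N"},
]

def generate_null_checks(metadata):
    """Emit one COUNT query per NOT NULL column; a non-zero result flags a defect."""
    return [
        f"SELECT COUNT(*) FROM {m['table']} WHERE {m['column']} IS NULL"
        for m in metadata
        if m["nullable"] == "N"
    ]

checks = generate_null_checks(METADATA)
```

Because the checks are derived from metadata rather than written by hand, new tables and columns are covered automatically as the schema grows.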

Integer8 Data Integrator

http://www.f6s.com/integer8
A visual data integration product designed to run as a web application with drag-and-drop components. Any data professional without coding experience can use it to build data pipelines through a 100% visual experience. It leverages the Apache Spark execution engine and works on top of Hadoop platforms.

In 2015, I founded a startup with two developers to launch the Integer8 product in local and international markets. I designed and led the development effort to make the product feasible for local SMEs. By the end of the first year, we had deployed our platform to two different retail companies.

I became a Microsoft Azure cloud partner in Turkey and spent more than a year qualifying Integer8 for the Azure Marketplace. At the end of this effort, Integer8 became an official Azure Marketplace product.

Data Warehouse Transformation for a Mobile Payment Company

More than 300 data pipeline jobs were converted from Talend to Airflow on a Python/Spark data architecture running on distributed Celery workers. The daily denormalized payment dataset is refreshed on Azure Blob Storage. The daily ETL duration was reduced by 60%.

As the data engineer responsible for the new data platform, I designed and implemented the entire data pipeline process. I built a CDC mechanism from the MySQL database into Kafka, providing a pub/sub event system for near-real-time integration. I then prepared real-time Spark Streaming jobs that consume the Kafka topics to refresh the target data stores. This helped the marketing and operations teams monitor the workload on the system and detect anomalies.
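
The consuming side of a CDC feed like this one mostly comes down to interpreting change events. The sketch below is hypothetical: the envelope fields ("op", "before", "after") follow the Debezium-style convention commonly used for MySQL CDC into Kafka, but the payment payload itself is invented, and the real pipeline applied this logic inside Spark Streaming jobs.

```python
import json
from typing import Optional

# Invented Debezium-style change event, as it might arrive on a Kafka topic
# from the MySQL CDC feed ("op": c=create, u=update, r=snapshot read, d=delete).
event = json.dumps({
    "op": "u",
    "before": {"payment_id": 42, "status": "pending"},
    "after": {"payment_id": 42, "status": "completed"},
})

def to_upsert(raw: str) -> Optional[dict]:
    """Return the row to upsert into the target store; None for deletes."""
    change = json.loads(raw)
    if change["op"] in ("c", "u", "r"):
        return change["after"]   # the post-change image is what we materialize
    return None                  # deletes are handled by a separate path

row = to_upsert(event)
```

Materializing only the "after" image is what keeps the target data store a near-real-time mirror of the transactional source.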

All data sources were consolidated into two main data marts in the Tableau reporting layer. Daily pre-aggregated tables helped real-time reports run 400% faster than the previous implementation. This also increased the motivation of power users across the organization to use the reporting tools.

Cloud ETL Automation on AWS

I prepared a cloud ETL automation solution for a client project using AWS Lambda and Python. In this project, I was responsible for connecting REST APIs and event sources for new events and collecting CRM information from AWS S3 buckets. I used AWS EC2 to build layers that added S3 and Pandas support to the Python environment. Individual Lambda functions were created to collect, clean, and transform the data sources on the serverless architecture and write them to the target database.

As the target database, I used Amazon Redshift. Individual events were thus sent from Amazon EventBridge into Lambda functions and accumulated in the Redshift database for further analysis.
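
One of those Lambda functions might look roughly like this. It is a hypothetical sketch: the event shape mimics an EventBridge detail payload, and `sink` stands in for the Redshift write (in AWS this would be a database client call, omitted here to keep the sketch self-contained and the transform testable in isolation).

```python
# Hypothetical Lambda-style handler: clean one CRM event and append the
# transformed row to the target. Field names are illustrative, not the
# client's actual schema.
def handler(event, sink):
    detail = event["detail"]
    row = {
        "customer_id": int(detail["customer_id"]),      # cast: source sends strings
        "email": detail["email"].strip().lower(),       # normalize for dedup
        "source": event.get("source", "unknown"),       # EventBridge source field
    }
    sink.append(row)   # placeholder for the Redshift INSERT
    return row

rows = []
out = handler(
    {"source": "crm.s3", "detail": {"customer_id": "7", "email": " A@B.COM "}},
    rows,
)
```

Separating the transform from the write like this is also what makes serverless ETL functions easy to unit test without AWS credentials.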

Languages

Python, SQL, JavaScript, Scala, Snowflake, TypeScript

Frameworks

Apache Spark, Hadoop

Tools

Apache Airflow, Amazon CloudWatch

Paradigms

ETL, MapReduce, Database Design

Storage

PL/SQL, Databases, Data Pipelines, Redis, Greenplum, HDFS, HBase, Apache Hive, Amazon S3 (AWS S3)

Other

Data Modeling, Data Warehousing, Data Warehouse Design, ETL Development, Data Engineering, Data Architecture, Big Data Architecture, OOP Designs, Data Structures, Algorithms

Libraries/APIs

Spark Streaming, Node.js, Pandas

Platforms

Azure, Apache Kafka, Oracle, Amazon Web Services (AWS), Docker, Visual Studio Code (VS Code), Kubernetes, Jupyter Notebook, Oracle Data Integrator 11g, Google Cloud Platform (GCP), AWS Lambda, Amazon EC2

2005 - 2010

Bachelor's Degree in Computer Engineering

Istanbul Technical University - Istanbul, Turkey

SEPTEMBER 2013 - PRESENT

Cloudera Certified Developer for Apache Hadoop

Cloudera
