José Molano, Developer in Bogotá - Bogota, Colombia
José is available for hire

José Molano

Verified Expert in Engineering

Data Engineer and Developer

Location
Bogotá - Bogota, Colombia
Toptal Member Since
November 1, 2022

José is a data engineer with over six years of experience in extract, transform, and load (ETL) pipeline development, data warehouse and data lake design, query performance tuning, and database cloud infrastructure management. With his background spanning multiple domains, José has designed and built scalable data platforms for contexts such as ticket exchange and resale, urban traffic, customer service, and tax evasion analysis.

Portfolio

Globant
Apache Airflow, AWS Lambda, Redshift, Snowflake, Amazon RDS...
SKG Tecnologia
Google Cloud Platform (GCP), Python, Apache Kafka, Pandas, PostgreSQL, Flask...
Alianza CAOBA
Apache Spark, Python, AWS Glue, AWS Lambda, Cloudera, MySQL, Docker...

Experience

Availability

Part-time

Preferred Environment

Apache Airflow, Apache Spark, Snowflake, BigQuery, Amazon RDS, MySQL, Python, Redshift, Terraform, AWS Glue

The most amazing...

...product I've built is an automated pipeline for migrating terabytes of data from MongoDB databases to a data warehouse using Apache Airflow and AWS.

Work Experience

Senior Data Engineer

2020 - PRESENT
Globant
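  • Performed job automation and scheduling with Airflow to support monthly accounting reviews.
  • Developed ETL pipelines for ingesting external data sources, such as MongoDB, into Snowflake using Apache Airflow and AWS.
  • Planned and executed transactional database migrations using AWS Database Migration Service (DMS).
  • Implemented AWS CloudWatch, Datadog monitors, and OpsGenie integrations for critical database metrics such as CPU and memory consumption and replication lag.
  • Managed Amazon Relational Database Service infrastructure and related resources, such as Amazon Virtual Private Cloud security groups and parameter groups, under source control with Terraform.
  • Improved SQL query performance on Amazon Aurora MySQL, reducing execution time and cloud infrastructure costs by introducing indexes and partitions on critical tables.
Technologies: Apache Airflow, AWS Lambda, Redshift, Snowflake, Amazon RDS, Amazon Elastic Container Service (Amazon ECS), Terraform, MySQL, Docker, Amazon CloudWatch, Amazon S3 (AWS S3), Data Engineering, Amazon Web Services (AWS), Boto, MariaDB, High Availability Disaster Recovery (HADR), Database Optimization, AWS HA, AWS Database Migration Service (DMS), Databases, Database Migration, Data Pipelines, Relational Databases, GitHub, Data, Reporting, Data Transformation, CRM APIs, CSV, Python Boolean, Boolean Search, ETL Tools, Data Migration, Data Management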

Data Engineer

2020 - 2020
SKG Tecnologia
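  • Implemented streaming ingestion pipelines for urban traffic mobility data using Apache Kafka and Python.
  • Designed a traffic analysis application based on BigQuery.
  • Provisioned SQL database cloud infrastructure with PostgreSQL and managed it using Google Cloud Platform.
  • Designed dashboards and processed data related to urban traffic speed analysis.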
Technologies: Google Cloud Platform (GCP), Python, Apache Kafka, Pandas, PostgreSQL, Flask, Docker, Data Engineering, JSON, Databases, Relational Databases, GitHub, Data, Data Transformation, Data Visualization, Microsoft Power BI, Data Analytics, Excel 365, Reports, CSV, Python Boolean, Boolean Search, Visualization, ETL Tools, Data Management, Spark

Data Technical Lead

2018 - 2020
Alianza CAOBA
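  • Designed and developed a big data lab environment using VirtualBox, Apache Hadoop, Apache Ambari, and Cloudera, reducing feature development and deployment time.
  • Developed a tool using Apache Spark and Apache Hive to anonymize sensitive customer information in large datasets.
  • Designed and developed a health analytics application using Amazon Elastic Compute Cloud (Amazon EC2), Amazon S3, and Amazon Athena.
  • Provisioned SQL database virtual infrastructure with PostgreSQL and handled database administration. Designed entity-relationship models for customer service and retail use cases and provided database access and user authorization.
  • Designed and developed Microsoft Power BI dashboards for analyzing data produced by natural language processing (NLP) machine learning models.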
Technologies: Apache Spark, Python, AWS Glue, AWS Lambda, Cloudera, MySQL, Docker, Apache Airflow, Data Engineering, JSON, ETL, Amazon Web Services (AWS), Databases, Data Visualization, GitHub, Data, NumPy, Reporting, Data Transformation, Microsoft Power BI, Data Analytics, Excel 365, Office 365, CSV File Processing, Data Analysis, Elasticsearch, JavaScript, Node.js, Reports, CSV, Python Boolean, Boolean Search, Visualization, ETL Tools, Tableau, Amazon Athena, Amazon Neptune, Data Management, Azure, Spark

Big Data Developer

2016 - 2017
Alianza CAOBA
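  • Created an automated pipeline for calculating the expected tax amounts of construction projects in Bogotá, Colombia using pandas.
  • Developed an automated pipeline using Apache Spark for cleaning and processing urban traffic mobility data, making it consumable through interactive dashboards.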
  • Managed the big data infrastructure, providing new services such as MongoDB, Apache Spark, Apache Hive, and Hadoop Distributed File System (HDFS).
  • Designed and developed Microsoft Power BI dashboards for analyzing urban traffic and mobility data.
Technologies: Apache Spark, Apache Hive, HDFS, Python, MongoDB, APIs, REST, JSON, ETL, OCR, Databases, Data Pipelines, Relational Databases, GitHub, Data, NumPy, Reporting, Data Transformation, Excel 365, Office 365, CSV File Processing, Data Analysis, Data Visualization, Microsoft Power BI, Data Analytics, Reports, CSV, Python Boolean, Boolean Search, Visualization, ETL Tools, Tableau, Spark

Adaptive Activities of Daily Living Recognition Based on Sensor Data Streams

http://www.sciencedirect.com/science/article/pii/S1877050918304551
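This project proposes an adaptive activities of daily living (ADL) discovery system that takes into account factors such as changes in individual behavior and respect for privacy.

The system was tested and validated with a dataset from real users. The results show that the approach performs well in real-world scenarios with the corresponding constraints.

The main contribution of this project is an ADL detection system that can adapt to changes in user behavior without retraining the model, while accounting for sensor failures and preserving user privacy.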

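ADACOP: A Big Data Platform for Open Government Data

ADACOP is an open government big data tool for monitoring and developing the potential of data from open government data portals. The solution automatically generates descriptive statistics about open government data and validates its quality by comparing different versions of the data.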

A Low-cost, Low-precision 2D Tracking System for Virtual Reality and Augmented Reality Applications

http://ceur-ws.org/Vol-1957/CoSeCiVi17_paper_9.pdf
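This work presents a low-cost 2D position tracker. The low-cost characteristic applies to both the hardware and software components. In addition, this work includes user testing with a prototype game application. The validation process showed that the proposed tracker achieves acceptable performance in environments with low precision requirements.
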
2016 - 2018

Master's Degree in Computer Engineering

University of The Andes - Bogotá, Colombia

2012 - 2016

Bachelor's Degree in Computer Engineering

University of The Andes - Bogotá, Colombia

DECEMBER 2022 - DECEMBER 2025

AWS Certified Cloud Practitioner

Amazon Web Services

Libraries/APIs

Pandas, NumPy, OpenCV, Node.js

Tools

Apache Airflow, GitHub, Microsoft Power BI, Tableau, BigQuery, Terraform, AWS Glue, Cloudera, Amazon Elastic Container Service (Amazon ECS), Amazon CloudWatch, Weka, Boto, Amazon Athena

Languages

Snowflake, Python, SQL, Java, JavaScript

Paradigms

ETL, Business Intelligence (BI), REST

Platforms

Amazon Web Services (AWS), AWS Lambda, Google Cloud Platform (GCP), Apache Kafka, Docker, Azure

Storage

MySQL, Databases, Data Pipelines, Relational Databases, MongoDB, JSON, MariaDB, Database Migration, Redshift, Apache Hive, HDFS, PostgreSQL, Amazon S3 (AWS S3), Data Lakes, Elasticsearch

Frameworks

Apache Spark, Spark, Hadoop, Flask, AWS HA

Other

Amazon RDS, Data Engineering, Data, CSV File Processing, Data Analysis, CSV, Python Boolean, Boolean Search, ETL Tools, Data Migration, Data Management, Database Optimization, AWS Database Migration Service (DMS), Data Visualization, Reporting, Data Transformation, Data Analytics, Office 365, Reports, Visualization, Streaming Data, APIs, Big Data, OCR, High Availability Disaster Recovery (HADR), CRM APIs, Excel 365, Amazon Neptune, AWS Certified Solution Architect
