Amanbir Singh, Developer in Delhi, India

Amanbir Singh

Verified Expert in Engineering

Data Scientist and Back-end Developer

Location
Delhi, India
Toptal Member Since
September 13, 2021

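Amanbir has 10 years of experience in data science, analytics, and back-end engineering. He has worked at a large multilateral organization and at early-stage tech startups. Amanbir excels at working with clients to solve complex business problems and has deep expertise in machine learning, data analysis, and building scalable web apps.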

Portfolio

ATS Software
Artificial Intelligence (AI), Machine Learning, Python, MySQL, GPT...
Monsoon CreditTech
Python, Pandas, Django, Angular, Docker, Kubernetes, Machine Learning...
IISD Experimental Lakes Area Inc - Main
Machine Learning, Data Science, Python, PostgreSQL, Amazon Web Services (AWS)...

Experience

Availability

Full-time

Preferred Environment

Python, Data Analytics, Data Science, Machine Learning, Pandas, Generative Pre-trained Transformers (GPT), OpenAI GPT-3 API, Minimum Viable Product (MVP), Generative Pre-trained Transformer 3 (GPT-3), OpenAI GPT-4 API, User Interface (UI), Product Management, Large Language Models (LLMs)

The most amazing...

...data science project I've worked on was building an automated machine learning platform for credit risk assessment from scratch.

Work Experience

ML Developer

2023 - PRESENT
ATS Software
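  • Worked on computer vision models to extract information from unstructured PDF files (including drawings, tables, etc.).
  • Worked on NER models to extract information from natural language and unstructured text.
  • Used GPT-4 to post-process the AI pipeline's output and improve performance. Also added rule-based post-processing to further improve pipeline performance.
  • Deployed the entire platform on AWS SageMaker and integrated it with the client's stack.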
  • Trained multimodal models to improve NER performance.
Technologies: Artificial Intelligence (AI), Machine Learning, Python, MySQL, GPT, Amazon SageMaker, Computer Vision, Named-entity Recognition (NER), Object Detection, Text Detection, Generative AI, Supervised Learning

Head of Product and Engineering

2016 - PRESENT
Monsoon CreditTech
  • Led the development of the SaaS AutoML platform as an architect and product manager; made wireframes, wrote user and functional requirements, decided on back-end architecture, and ran sprints using Django, Angular, Jenkins, and Docker.
  • Architected AutoML libraries used internally. The platform generated machine learning models optimized for lending.
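  • Served as product manager and architect for the developer tools used by our internal data science team to speed up model development and deployment.
  • Managed client engagements with 15 banks and NBFCs; built and deployed models to identify risky borrowers at the time of application. Increased client revenue by 20% or more.
  • Hired and managed a team of 10+ data scientists and software developers. Conducted one-on-one mentoring, set goals for the team, and guided junior members.
  • Built an automated deployment process for machine learning models that supports multiple and multi-stage models.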
Technologies: Python, Pandas, Django, Angular, Docker, Kubernetes, Machine Learning, Data Science, Machine Learning Operations (MLOps), XGBoost, Jupyter Notebook, SQL, Data Analytics, Data Visualization, Data Mining, Web Scraping, Data Reporting, Artificial Intelligence (AI), Agile, Data Analysis, Time Series, Time Series Analysis, Optimization, Financial Modeling, Amazon Web Services (AWS), MySQL, Azure, Scikit-learn, Statistics, Statistical Analysis, Real-time Data, Predictive Analytics, APIs, Banking & Finance, Architecture, Leadership, Automation Scripting, Scripting, AWS Lambda, REST APIs, Amazon S3 (AWS S3), HTML, Decision Trees, Data Scientist, Natural Language Processing (NLP), Recommendation Systems, Regression, PDF Scraping, Scraping, Back-end, Software Architecture, Azure ML Studio, Git, Amazon DynamoDB, PostgreSQL, Non-performing Loans (NPL), Data Scraping, TypeScript, NumPy, MongoDB, Serverless, Predictive Modeling, Customer Segmentation, Visualization, Django REST Framework, Full-stack Development, API Integration, AI Design, Automation, Full-stack, CSS, Flask, Solution Architecture, Software Development, PyPDF2, openpyxl, Microservices, Advisory, Technology Strategy & Architecture, Databases, Web Development, CTO, DevOps, Google Cloud Platform (GCP), JavaScript, Object-relational Mapping (ORM), Technical Leadership, Database Architecture, Agile Software Development, Data Structures, Amazon SageMaker, ETL, Minimum Viable Product (MVP), Requirements Analysis, Startups, Mathematics, Task Scheduling, Regular Expressions, Sockets, Linear Regression, Data-driven Decision-making, Decision Modeling, Neural Networks, Programming, Integration, User Interface (UI), Cloud, Models, Exploratory Data Analysis, EDA, Modeling, Data Cleaning, Unstructured Data Analysis, Large Data Sets, Data Gathering, Spreadsheets, Machine Learning Automation, Amazon Elastic Container Service (Amazon ECS), Data Processing, Product Management, Amazon EC2, Back-end Development, Azure Cosmos DB, GitHub, Azure Functions, Azure Blobs, Scrapy, Large Language Models (LLMs), Regression Modeling, Language Models, FastAPI, Containerization, Vertex, System Architecture, Product Roadmaps, Product Strategy, Team Leadership, Project Management, Azure Machine Learning, Pytest, Unit Testing, Statistical Modeling, NoSQL, Object-oriented Programming (OOP), Research, Cloud Computing, Unsupervised Fraud Detection, Unsupervised Learning, Supervised Learning, Open-source LLMs

Data Scientist | ML Expert

2023 - 2024
IISD Experimental Lakes Area Inc - Main
  • Developed a model that uses meteorological data to predict the date a lake's ice will melt. The prediction was within a day of the actual ice melting date.
  • Used boosting, bagging, and other algorithms to improve performance.
  • Created a dashboard in React to display model predictions and performance.
Technologies: Machine Learning, Data Science, Python, PostgreSQL, Amazon Web Services (AWS), React, Gradient Boosting, Scikit-learn, Google Earth, Statistical Modeling, Object-oriented Programming (OOP), Cloud Computing, Generative AI, Supervised Learning

Data Scientist

2023 - 2023
Independent Research Group
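  • Created a simulation to model the interactions between different economic participants (companies, employees, non-economic participants, etc.).
  • Ran Markov chain simulations to understand the effects of different initial states and interventions.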
  • Created output visualizations and statistics to test hypotheses.
Technologies: Data Science, Agent-based Modeling, R, Python, Markov Chain Monte Carlo (MCMC) Algorithms, Monte Carlo Simulations, Simulations, Unsupervised Learning

AI/ML Developer

2023 - 2023
America Interpretation
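  • Developed a real-time translation API that converts speech to speech across any language.
  • Built a back end in Django to process streaming audio data and return translated audio data and transcriptions. The back end also handled meeting creation and meeting joining.
  • Created a front end in React and used RecordRTC to capture audio. Established a WebSocket connection to stream audio to the back end.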
  • Deployed both front and back end on Azure services.
  • Integrated with multiple translation and speech generation services.
Technologies: Python, Artificial Intelligence (AI), Machine Learning, Text to Speech (TTS), Speech to Text, Natural Language Processing (NLP), React, Azure Text to Speech, Elementor, Django, Azure, OpenAI, WebSockets, TypeScript, JavaScript, RecordRTC, Voice Recognition, Language Models, Prompt Engineering, System Architecture, New Product Development, Object-oriented Programming (OOP), Cloud Computing, Generative AI

AI/ML Expert/Consultant

2023 - 2023
Harbor
  • Used prompt engineering to improve LLM predictions.
  • Compared open-source LLMs against closed models.
  • Self-hosted open-source LLMs on the company's infrastructure.
  • Built a prompt testing framework in Python to compare and improve prompts.
Technologies: OpenAI GPT-3 API, GPT, OpenAI GPT-4 API, Generative Pre-trained Transformers (GPT), Generative Pre-trained Transformer 3 (GPT-3), AIOps, Machine Learning Operations (MLOps), Natural Language Processing (NLP), Graphics Processing Unit (GPU), AI Design, Amazon SageMaker, Hugging Face, ChatGPT, Amazon EC2, Back-end Development, GitHub, LangChain, Pinecone, Large Language Models (LLMs), OpenAI, LlamaIndex, Language Models, Prompt Engineering, Containerization, System Architecture, Project Management, Retrieval-augmented Generation (RAG), Llama 2, NoSQL, Object-oriented Programming (OOP), Research, Cloud Computing, Generative AI, Open-source LLMs

AI/ML Engineer

2023 - 2023
Grown Unknown, LLC
  • Developed prompts that use OpenAI APIs to generate customized parenting advice.
  • Added context to the prompts to tailor the tone of the outputs.
  • Compared OpenAI with other options to plan future product development.
Technologies: Python, Machine Learning, Language Models, OpenAI GPT-4 API, OpenAI GPT-3 API, GPT, Data Scientist, Language Learning, Generative Systems, Natural Language Processing (NLP), ChatGPT, Large Language Models (LLMs), OpenAI, Prompt Engineering, System Architecture, Generative AI

Machine Learning Expert

2023 - 2023
AmpVis Ltd.
  • Advised the client on building an MVP, including all the technical steps required.
  • Decided on team structures to handle different product decisions.
  • Consulted on hiring decisions for other technical roles.
Technologies: Python, Machine Learning, Artificial Intelligence (AI), Data Science, APIs, Google Vision API, Amazon Rekognition, Programming, Cloud, Models, Data Scientist, Generative Systems, Deep Learning, Large Language Models (LLMs), Product Roadmaps, Product Strategy

Data Scientist

2023 - 2023
NewCloud Medical LLC
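  • Built a Looker Studio dashboard to display data and summary statistics based on filters.
  • Added visualizations in Looker Studio to generate insights from the data.
  • Created dashboard views that update dynamically based on the selected fields.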
Technologies: Python, PDF Scraping, Scraping, Databases, Looker, Programming, Language Models, GPT, Data Cleaning, Data Scientist, Spreadsheets, Data Processing, Large Language Models (LLMs)

Research Coordinator

2015 - 2016
JustJobs Network
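  • Built an internal data management system to track dataset versions.
  • Led research on vocational training and skill-building programs in India. Led data collection and analysis; published a findings report.
  • Designed statistics and R training modules for onboarding new employees.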
Technologies: Python, R, Data Analytics, Data Visualization, Data Mining, Web Scraping, Data Reporting, Data Analysis, Statistics, Statistical Analysis, Automation Scripting, Scripting, Data Scientist, Regression, Scraping, Git, Predictive Modeling, Visualization, Automation, Mathematics, Linear Regression, Data-driven Decision-making, Decision Modeling, Programming, Models, Exploratory Data Analysis, EDA, Modeling, Data Cleaning, Unstructured Data Analysis, Data Gathering, Spreadsheets, Data Processing, Regression Modeling, Project Management, Statistical Modeling, Research, Supervised Learning

Consultant

2014 - 2015
World Bank Group
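  • Oversaw data collection for 4,500 individual and household surveys across the state.
  • Built models to identify factors affecting adolescents' education and labor market outcomes.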
  • Participated in the dissemination of research findings.
Technologies: R, Data Science, Data Analytics, Data Visualization, Data Mining, Data Reporting, Data Analysis, Statistics, Statistical Analysis, Automation Scripting, Scripting, Regression, Git, Predictive Modeling, Visualization, ETL, Mathematics, Linear Regression, Data-driven Decision-making, Decision Modeling, Programming, Models, Exploratory Data Analysis, EDA, Modeling, Data Cleaning, Unstructured Data Analysis, Data Gathering, Spreadsheets, Data Processing, Regression Modeling, Project Management, Statistical Modeling, Research, Unsupervised Learning, Supervised Learning

Senior Research Associate

2012 - 2014
Centre for Microfinance Research
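  • Managed two randomized controlled trials studying the impact of financial access in India.
  • Trained and supervised a 30-member field team conducting 1,700 individual surveys across four districts.
  • Designed and implemented six electronic questionnaires using Open Data Kit and SurveyCTO and built the back end for the survey data.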
Technologies: STATA, Survey Design, Open Data Kit, Data Visualization, Data Mining, Data Reporting, Data Analysis, Causal Inference, Statistics, Statistical Analysis, Automation Scripting, Regression, Visualization, Mathematics, Linear Regression, Data-driven Decision-making, Models, Exploratory Data Analysis, EDA, Modeling, Data Cleaning, Unstructured Data Analysis, Data Gathering, Spreadsheets, Data Processing, Regression Modeling, Project Management, Statistical Modeling, Research, Unsupervised Learning, Supervised Learning

AutoML Platform for Lenders

http://monsoonfintech.com/thoth/
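Built an AutoML platform that takes data from lenders and generates state-of-the-art machine learning models. Supports traditional financial data and alternative data (SMS, mobile, etc.).

The platform generates models for new loan applications and for supporting collections on running loans. This was offered as a SaaS product.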

Custom Machine Learning Models for Lenders

http://monsoonfintech.com/
Managed a team of developers and data scientists building models for lenders. This included models that predicted the risk of loan applications, recommendation engines for financial products, and marketing models to identify and reach target customers.

Built and delivered models to the largest lenders in India. This reduced delinquency rates by 30% and increased loan approval rates by 25%.

Report for the World Bank

http://documents.worldbank.org/en/publication/documents-reports/documentdetail/866381523450216235/a-window-of-opportunity-a-diagnostic-of-adolescent-girls-and-young-women-s-socio-economic-empowerment-in-jharkhand-india
Worked closely with the World Bank to identify the critical challenges facing adolescent girls in Jharkhand, India, along with key reforms to address them.

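My work included experiment design, data collection, analysis, and modeling. I was also responsible for disseminating the report and communicating with key stakeholders.

2008 - 2012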

Bachelor's Degree in Economics and Statistics

Carnegie Mellon University - Pittsburgh, PA, USA

Libraries/APIs

Pandas, XGBoost, Scikit-learn, REST APIs, NumPy, Beautiful Soup, Sockets, Google Vision API, Amazon Rekognition, React, RecordRTC

Tools

Amazon SageMaker, ChatGPT, Git, Spreadsheets, Amazon Elastic Container Service (Amazon ECS), GitHub, Azure Machine Learning, Pytest, STATA, Open Data Kit, Azure ML Studio, Looker, Named-entity Recognition (NER)

Frameworks

Django, Django REST Framework, Bootstrap, Material UI, LlamaIndex, Angular, Flask, Scrapy

Languages

Python, HTML, R, SQL, TypeScript, CSS, JavaScript

Paradigms

Data Science, Automation, Object-relational Mapping (ORM), Object-oriented Programming (OOP), Agile, Microservices, Agile Software Development, ETL, Requirements Analysis, Unit Testing, DevOps, Agent-based Modeling

Platforms

Jupyter Notebook, AWS Lambda, Amazon EC2, Docker, Amazon Web Services (AWS), Azure, Azure Functions, Kubernetes, Google Cloud Platform (GCP)

Storage

MySQL, Amazon S3 (AWS S3), PostgreSQL, MongoDB, Databases, Database Architecture, Azure Cosmos DB, Azure Blobs, NoSQL, Amazon DynamoDB

Industry Expertise

Project Management, Banking & Finance

Other

Machine Learning, Data Analytics, Data Mining, Web Scraping, Artificial Intelligence (AI), Data Analysis, Statistics, Statistical Analysis, Predictive Analytics, APIs, Architecture, Automation Scripting, Scripting, Decision Trees, Data Scientist, Natural Language Processing (NLP), Regression, PDF Scraping, Scraping, Back-end, Software Architecture, Non-performing Loans (NPL), Data Scraping, Predictive Modeling, Customer Segmentation, Visualization, Full-stack Development, API Integration, Software Development, PyPDF2, Advisory, Technology Strategy & Architecture, Web Development, CTO, Technical Leadership, Generative Pre-trained Transformers (GPT), OpenAI GPT-3 API, Minimum Viable Product (MVP), Startups, Regular Expressions, Linear Regression, Data-driven Decision-making, Programming, Integration, Models, GPT, Exploratory Data Analysis, EDA, Modeling, Data Cleaning, Unstructured Data Analysis, Large Data Sets, Data Gathering, Machine Learning Automation, Data Processing, Back-end Development, Regression Modeling, Large Language Models (LLMs), OpenAI, Prompt Engineering, System Architecture, Product Roadmaps, Product Strategy, New Product Development, Team Leadership, Statistical Modeling, Unsupervised Learning, Supervised Learning, Machine Learning Operations (MLOps), Data Visualization, Data Reporting, Time Series, Time Series Analysis, Real-time Data, Leadership, Recommendation Systems, Serverless, AI Design, Full-stack, Solution Architecture, Data Structures, Generative Pre-trained Transformer 3 (GPT-3), Mathematics, Task Scheduling, OpenAI GPT-4 API, Decision Modeling, Neural Networks, Cloud, Language Models, Language Learning, Generative Systems, Product Management, LangChain, Speech to Text, Voice Recognition, FastAPI, Containerization, Retrieval-augmented Generation (RAG), Llama 2, Research, Cloud Computing, Unsupervised Fraud Detection, Generative AI, Open-source LLMs, Survey Design, SaaS, Optimization, Financial Modeling, Causal Inference, openpyxl, User Interface (UI), Deep Learning, AIOps, Graphics Processing Unit (GPU), Hugging Face, Pinecone, Text to Speech (TTS), Azure Text to Speech, Elementor, WebSockets, Vertex, Gradient Boosting, Google Earth, Markov Chain Monte Carlo (MCMC) Algorithms, Monte Carlo Simulations, Simulations, Computer Vision, Object Detection, Text Detection
