Data engineer

Full-time

The worldwide data management software market is massive (IDC forecasts it to be $137.6 billion by 2026!). At MongoDB we are transforming industries and empowering developers to build amazing apps that people use every day.

We are the leading modern data platform and the first database provider to IPO in over 20 years. Join our team and be at the forefront of innovation and creativity.

MongoDB is growing rapidly and seeking a Data Engineer to be a key contributor to the company’s Internal Data Platform. You will build ETL pipelines that pull data into our Data Lake / Warehouse and that will be used to drive forward our growth as a product and as a company.

You will take on complex data-related problems using very diverse data sets, and will work with stakeholder groups throughout the company to help them make better data-informed decisions.

This role can be based out of our New York City office.

Our ideal candidate has experience with

Building ETL pipelines at scale that can grow without sacrificing performance
Data Lake / Warehouse design patterns and concepts, including Delta Lakes
Several programming languages (Python, Scala, Java, etc.)
Data processing frameworks such as Spark and Pandas
Orchestration tools such as Airflow, Luigi, Azkaban, Cask, etc.
AWS services such as S3, Kinesis, EMR, Lambda, Athena, Glue, IAM, RDS, etc.
Different storage formats such as Parquet, JSON, Avro, and Arrow
Streaming data processing frameworks like Kafka, KSQL, and Spark Streaming
A diverse set of databases (MongoDB, Redshift, etc.)

You might be an especially great fit if you

Enjoy wrangling huge amounts of data and exploring new data sets
Value code simplicity and performance
Obsess over data: everything needs to be accounted for and be thoroughly tested
Plan effective data storage, security, sharing, and publishing within an organization
Constantly think of ways to squeeze better performance out of data pipelines

Nice to haves

You are deeply familiar with Spark and/or Hive
You have expert experience with Airflow
You understand the differences between different storage formats like Parquet, Avro, Arrow, and JSON and when to use each
You understand the tradeoffs between different schema designs like normalization vs. denormalization
In addition to data pipelines, you’re also quite good with Kubernetes, Drone, and Terraform
You’ve built an end-to-end production-grade data solution that runs on AWS or GCP
You have experience building machine learning pipelines using tools such as SparkML, TensorFlow, scikit-learn, etc.

Responsibilities

As a Data Engineer, you will

Help drive best practices in continuous integration and delivery
Help drive optimization, testing, and tooling to improve data quality
Collaborate with other software engineers, machine learning experts, and stakeholders, taking learning and leadership opportunities that will arise every single day

To drive the personal growth and business impact of our employees, we’re committed to developing a supportive and enriching culture for everyone.

From employee affinity groups to fertility assistance and a generous parental leave policy, we value our employees’ wellbeing and want to support them along every step of their professional and personal journeys.

Learn more about what it’s like to work at MongoDB, and help us make an impact on the world!

MongoDB is committed to providing any necessary accommodations for individuals with disabilities within our application and interview process.

To request an accommodation due to a disability, please inform your recruiter.

MongoDB, Inc. provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type and makes all hiring decisions without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.


Related Jobs


Data engineer

Lorven Technologies, Inc. New York, NY

Our client is looking for a Data Engineer for a long-term project in Bloomfield, CT; New York, NY; Austin, TX; Chicago, IL (initially remote). The detailed requirements are below.

Job Title: Data Engineer

Duration: Long Term, W2 Tax Term

Job description:

Bachelor's degree in Computer Science or equivalent, with a minimum of 9 years of relevant experience.
Must have experience with PySpark, Python, Angular, SQL, Azure Databricks, Metadata.
Knowledge of at least one component: Azure Data Factory, Azure Data Lake, Azure SQL DW, Azure SQL
Expertise in ETL, API development, Microservices design and Cloud deployment solutions
Experience in RESTful APIs using message formats such as JSON and XML
Experience in integration technologies such as Kafka
Experience in Python and frameworks such as Flask or Django
Experience in RDBMS and NoSQL databases
Good understanding of SQL, T-SQL, and/or PL/SQL
Hands-on experience developing applications on AWS and/or OpenShift
Automation skills using Infrastructure as Code
Familiarity with creating web applications using AngularJS or React
Familiarity with creating benchmark tests, designing for scalability and performance, and designing/integrating large-scale systems.
Familiarity with building cloud-native applications and knowledge of cloud tools such as Kubernetes and Docker containers
Excellent communication skills, including the ability to communicate effectively with internal and external customers.
Ability to use strong industry knowledge to relate to customer needs and resolve customer concerns, with a high level of focus and attention to detail.
Strong work ethic and good time management, with the ability to work with diverse teams and lead meetings.

Full-time

Data Engineer

LMI New York, NY

Overview

LMI is a consultancy dedicated to powering a future-ready, high-performing government, drawing from expertise in digital and analytic solutions, logistics, and management advisory services.

We deliver integrated capabilities that incorporate emerging technologies and are tailored to customers’ unique mission needs, backed by objective research and data analysis.

Founded in 1961 to help the Department of Defense resolve complex logistics management challenges, LMI continues to enable growth and transformation, enhance operational readiness and resiliency, and ensure mission success for federal civilian and defense agencies.

This position is remote but may require travel to a client site in Washington, DC (Georgetown).

Responsibilities

As a Data Engineer you will help develop and deploy technical solutions to solve our customers’ hardest problems, using various platforms to integrate data, transform insights, and build first-class applications for operational decisions.

You will leverage everything around you: core customer products, open source technologies (e.g. GHE), and anything you and your team can build to drive real impact.

In this role, you work with customers around the globe, where you gain rare insight into the world’s most important industries and institutions.

Each mission presents different challenges, from the regulatory environment to the nature of the data to the user population.

You will work to accommodate all aspects of an environment to drive real technical outcomes for our customers.

Core Responsibilities

Set up transfers of data feeds from source systems into a location accessible to Foundry, and integrate them with existing data using enterprise architecture best practices
Debug issues related to delayed or missing data feeds
Monitor build progress and debug build problems in conjunction with deployment teams
Use Foundry’s application development framework to design applications that address operational questions
Run rapid development and iteration cycles with SMEs, including testing and troubleshooting application issues
Execute requests for information (RFIs) surrounding the platform’s data footprint

Qualifications

Bachelor’s degree in data science, mathematics, statistics, economics, computer science, engineering, or a related business or quantitative discipline (Master’s degree preferred)
Preferred: Interim or active DoD Secret clearance.
Strong engineering background, preferably in fields such as Computer Science, Mathematics, Software Engineering, Physics, or Data Science.
Proficiency with programming languages such as Python (PySpark, Pandas), SQL, R, JavaScript, or similar languages.
Working knowledge of databases and SQL; preferred qualifications include linking analytic and data visualization products to database connections
At least 9 years of experience in the field
Ability to work effectively in teams of technical and non-technical individuals.
Skill and comfort working in a rapidly changing environment with dynamic objectives and iteration with users.
Demonstrated ability to continuously learn, work independently, and make decisions with minimal supervision.
Proven track record of strong customer communications, including feedback gathering, execution updates, and troubleshooting.


LMI is an Equal Opportunity Employer. LMI is committed to the fair treatment of all and to our policy of providing applicants and employees with equal employment opportunities.

LMI recruits, hires, trains, and promotes people without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, pregnancy, disability, age, protected veteran status, citizenship status, genetic information, or any other characteristic protected by applicable federal, state, or local law.

If you are a person with a disability needing assistance with the application process, please contact


Temporary

Senior data engineer

Fitch Ratings New York, NY

At Fitch, we have an open culture where employees are able to exchange ideas and perspectives, throughout the organization, irrespective of their seniority.

Your voice will be heard, allowing you to have a real impact. We embrace diversity and appreciate authenticity, encouraging an environment where employees can be their true selves.

Our inclusive and progressive approach helps us to keep a balanced perspective. Fitch is also committed to supporting its employees by advancing conversations around diversity, equity and inclusion.

Fitch’s Employee Resource Groups (ERGs) have been established by employees who have joined together as a workplace community based on similar backgrounds or life experiences.

Fitch’s ERGs are available to connect employees with others within the organization to offer professional and personal support.

With our expertise, we are not only creating data and information, but also producing timely insights from every angle to influence decision making in this ever-changing and highly competitive market.

We have a relentless hunger to innovate and unlock the power of human insights and to drive value for our customers. There has never been a better time to make an impact and we invite you to join us on this journey.

Fitch Ratings is a leading provider of credit ratings, commentary and research. Dedicated to providing value beyond the rating through independent and prospective credit opinions, Fitch Ratings offers global perspectives shaped by strong local market experience and credit market expertise.

The additional context, perspective and insights we provide have helped fund a century of growth and enable you to make important credit judgments with confidence.


Fitch is seeking a strong Data Engineer to improve critical data systems used widely by internal and external stakeholders.

The ideal candidate is someone who:

Has 5+ years of data engineering experience developing large data pipelines
Has strong experience developing in Python and Java
Has strong experience with relational SQL and NoSQL databases

Roles & Responsibilities

Build data pipelines and applications to stream and process datasets at low latencies.
Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL, NoSQL, Kafka, and AWS big data technologies.
Collaborate with Data Product Managers, Data Architects, and other Data Engineers to design, implement, and deliver successful data solutions.
Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics.
Track data lineage, ensure data quality and improve discoverability of data.
Work in an Agile (Scrum) environment and interact with multi-functional teams (Product Owners, Scrum Masters, Developers, Designers, Data Analysts)

Required Skills

Strong experience developing in Python & Java.
5+ years of data engineering experience developing large data pipelines
Strong SQL and NoSQL skills and ability to create queries to extract data and build performant datasets.
Hands-on experience with message queuing and stream data processing (Kafka Streams).

Desirable Skills

Experience with relational SQL and NoSQL databases: any RDBMS (Oracle, Postgres) and NoSQL store (Cassandra, Mongo, Redis, etc.).
Hands-on experience with distributed systems such as Spark, Hadoop (HDFS, Hive, Presto, PySpark) to query and process data.
Strong analytic skills related to working with unstructured datasets.
Hands-on experience in using AWS cloud services: EC2, Lambda, S3, Athena, Glue, and EMR
Redshift/Snowflake
Experience in the Financial Services industry

Person specification

Excellent problem solving and analytical skills
Highly motivated to deliver results and meet deadlines

Full-time

Data Engineer

Fourier Ltd New York, NY

Posted by Python US Recruiter

Fourier has partnered with several world-leading hedge funds, prop traders, and market makers in a search for elite and eager Data Engineers to join them.

Our clients are looking for the best data engineers in the industry, with a proven track record of delivering scalable and robust data systems and a drive to solve seemingly unsolvable problems.

They are looking for individuals who are driven and motivated but, most importantly, excited by data!

Do you love working with Python and have experience managing ETL pipelines, building a scalable distributed data platform, or deriving insights from alternative data sets?

Do you have an affinity for learning new technologies, and are you always looking to broaden your existing technical knowledge?

These clients are at the pinnacle of finance, and therefore leading compensation packages should be expected.

Primary Tech Stack:

Python

Data storage and manipulation tools such as SQL, Pandas, NumPy.

Various ETL / ELT Technologies.

Full-time