Data Engineer Intern
About our Company:
Medidata: Powering Smarter Treatments and Healthier People
Medidata, a Dassault Systèmes company, is leading the digital transformation of life sciences, creating hope for millions of people. Medidata generates the evidence and insights that help pharmaceutical, biotech, medical device and diagnostics companies, and academic researchers accelerate value, minimize risk, and optimize outcomes. More than one million registered users across 2,000+ customers and partners access the world's most trusted platform for clinical development, commercial, and real-world data. Known for its groundbreaking technological innovations, Medidata has supported more than 33,000 clinical trials and 10 million study participants. Medidata is headquartered in New York City and has offices around the world to meet the needs of its customers. Discover more at www.medidata.com and follow us on LinkedIn, Instagram, and X.
The Program:
At Medidata, interns will have the opportunity to accelerate their careers by working closely with experienced professionals and gaining valuable, hands-on, full-time work experience. As part of our global organization, interns work alongside our talented and committed professionals, building a strong foundation for achieving their career goals. For 12 weeks, beginning May 19, 2025, interns will have an opportunity to gain a deep understanding of what it means to be a Medidatian. United around a single goal of empowering smarter treatments and healthier people, Medidatians work in a culture of curiosity, innovation and fun. You will contribute to the line of business with sustainable and meaningful work.
Our Summer Internship program also includes instructor-led training, guided mentorship, exposure to senior leadership and community service. In addition to their individual, role-specific responsibilities, each intern will participate in our Intern Innovation Lab. Assigned to cross-functional teams, interns will work closely together to develop an innovative solution to a business problem currently facing Medidata. As they work diligently to present their final solutions to a panel of top Medidata leaders, we are confident that our interns will make a significant impact on our business.
About the Team:
We are seeking a motivated and detail-oriented Data Engineer Intern to assist in the design, development, and optimization of data pipelines and databases that support business insights and decision-making. In this role, you will work closely with the Office of Social Impact and Engagement team, as well as our data and IT teams, to build scalable data solutions. This is a great opportunity for an individual looking to gain hands-on experience in data engineering, ETL processes, and database management.
Responsibilities:
- Data Pipeline Development: Assist in designing, building, and maintaining ETL (Extract, Transform, Load) processes to ingest and transform data from various sources.
- Database Design & Management: Support the development and optimization of databases to ensure efficient data storage, integrity, and retrieval.
- Data Integration: Work with structured and unstructured data sources to integrate information into centralized databases or data warehouses.
- Query Optimization: Assist in writing and optimizing SQL queries to improve database performance and enable efficient data access.
- Big Data Processing: Learn and apply big data technologies such as Apache Spark, Hadoop, or cloud-based data services.
- Data Quality & Validation: Implement data validation and cleansing techniques to ensure accuracy, consistency, and completeness of data.
- Collaboration: Work with cross-functional teams to understand data needs and contribute to scalable data solutions.
- Documentation: Maintain detailed documentation of data models, ETL processes, and database structures for future reference.
- Security & Compliance: Assist in ensuring data security, privacy, and compliance with relevant policies and best practices.
Qualifications:
- Technical Skills: Familiarity with SQL and relational database management systems (e.g., MySQL, PostgreSQL, SQL Server).
- Programming Knowledge: Exposure to Python, Scala, or Java for data manipulation and pipeline development.
- Data Processing: Understanding of ETL processes and tools such as Apache Airflow, dbt, or Talend is a plus.
- Cloud & Big Data Tools (Preferred): Experience with cloud platforms like AWS (Redshift, S3, Glue), Google Cloud (BigQuery, Dataflow), or Azure Data Services.
- Problem-Solving: Strong analytical and problem-solving skills with attention to detail.
- Communication: Ability to explain technical concepts to non-technical stakeholders.
- Collaboration: Ability to work effectively in a team-oriented environment.
- Adaptability: Willingness to learn and experiment with new data technologies and methodologies.
Preferred Qualifications:
- Experience with data warehousing concepts and tools.
- Familiarity with version control systems like Git.
- Basic understanding of data security and privacy principles.
- Exposure to data visualization tools (e.g., Tableau, Power BI, Looker) is a plus.
The salary range posted below refers only to positions that will be physically based in New York. As with all roles, Medidata sets ranges based on a number of factors including function, level, candidate expertise and experience, and geographic location. Pay ranges for candidates in locations other than New York may differ based on the local market data in that region. The base hourly pay for this position is $32.00 an hour, plus a $3,500 sign-on bonus.
Equal Employment Opportunity:
In order to provide equal employment and advancement opportunities to all individuals, employment decisions at Medidata are based on merit, qualifications and abilities. Medidata is committed to a policy of non-discrimination and equal opportunity for all employees and qualified applicants without regard to race, color, religion, gender, sex (including pregnancy, childbirth or medical or common conditions related to pregnancy or childbirth), sexual orientation, gender identity, gender expression, marital status, familial status, national origin, ancestry, age, disability, veteran status, military service, application for military service, genetic information, receipt of free medical care, or any other characteristic protected under applicable law. Medidata will make reasonable accommodations for qualified individuals with known disabilities, in accordance with applicable law.
Applications will be accepted on an ongoing basis until the position is filled.
#LI-MW1
#LI-Hybrid