Data Engineer | Data Analyst | Cloud Enthusiast | Business Analyst
I am an experienced Data Engineer with 3+ years of expertise in designing, building, and optimizing complex data pipelines, ETL processes, and data architectures. I enjoy solving business challenges with scalable data solutions and using data to drive decision-making.
- Currently working as a Data Engineer Intern at Marlabs Inc., where I optimize data workflows using Azure, Databricks, and real-time ingestion tools like Kafka.
- Skilled in Big Data frameworks like Apache Spark and Hadoop, with a strong command of SQL, Python, and ETL tools like SSIS and Azure Data Factory.
- Hands-on experience with cloud platforms including Azure, AWS, and Google Cloud Platform, focusing on serverless data management and real-time processing solutions.
- Passionate about data visualization, using tools like Power BI, Tableau, and Qlik Sense to turn complex datasets into actionable insights.
- Proficient in data quality assurance, data modeling, and data governance, ensuring accuracy and consistency across diverse sources.
- Languages: SQL, Python, PL/SQL, Bash
- Libraries: Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn, TensorFlow
- Automation: Git, GitHub, SQL Server Agent, CI/CD Pipelines
- Frameworks: Apache Spark, Hadoop, HDFS, Hive, Apache Kafka, Databricks
- Data Pipelines: ETL pipelines with SSIS, Azure Data Factory, real-time data ingestion with Kafka and Event Hubs
- Microsoft Azure: Azure Data Factory, Azure Synapse Analytics, Azure SQL Database, Azure Event Hubs
- AWS: S3, EC2, Lambda, IAM, RDS
- Google Cloud Platform: BigQuery, Dataflow
- Relational Databases: MySQL, PostgreSQL, SQL Server
- NoSQL: MongoDB, Cassandra
- Visualization: Power BI, Tableau, Qlik Sense, MS Excel (Advanced)
- Real-Time Dashboards: Developed and maintained business intelligence dashboards to track key performance indicators (KPIs)
- Data Lineage & Governance: Apache Atlas, Azure Purview
- Data Quality: Implementing data validation, error handling, and governance frameworks for high data integrity
- Optimized ETL workflows across multiple industries using tools like Azure Data Factory, Databricks, and SSIS, improving data processing speeds by up to 25%.
- Integrated real-time data from diverse sources with Apache Kafka and Azure Event Hubs, ensuring seamless ingestion and processing.
- Led a cloud migration project, reducing operational costs by 25% while automating data transfer and optimizing cloud resources.
- Developed 300+ ETL packages and optimized SQL queries, enhancing system performance by 40% and ensuring smooth legacy system integration.
- Built Power BI dashboards and automated data pipelines that processed 1M+ records, reducing data processing times by 30% and improving decision-making efficiency.
- Real-time Data Ingestion: Integrated Apache Kafka and Azure Event Hubs for critical real-time operations, improving data flow efficiency.
- Predictive Modeling: Built machine learning models using scikit-learn and TensorFlow to forecast research trends, boosting engagement by 15%.
- Data Visualization Dashboards: Developed custom dashboards in Power BI and Tableau to monitor pipeline performance, enhancing decision-making and reducing downtime by 30%.
- Financial Data Migration: Automated ETL processes using SSIS and Apache Spark, optimizing data transformation for high-volume environments.
I'm passionate about using data to drive business impact by optimizing pipelines, building predictive models, and creating actionable dashboards. I thrive on continuous learning, staying up-to-date with the latest in data engineering and cloud technologies, and enjoy collaborating across teams to solve complex problems and enhance data infrastructure.
- Enhancing my skills in distributed computing and real-time data processing.
- Experimenting with advanced machine learning techniques for predictive modeling and data analytics.
- LinkedIn: Pranay Datta Kavukuntla
I'm always open to new opportunities and collaborations. Feel free to reach out if you'd like to discuss projects, exchange ideas, or explore how I can help with your data engineering challenges!