
📦 A personal knowledge base for learning Docker and Podman with a focus on data engineering workflows. Includes notes, examples, Dockerfiles, Compose setups, and mini-projects for practice.


AminMosallanejad339/Docker-Podman_Theo


Docker & Podman for Data Engineering

This repository is my personal knowledge base and practice space for learning Docker and Podman with a focus on data engineering use cases.
It contains notes, examples, Dockerfiles, and mini-projects that I can revisit anytime.


📂 Repository Structure

docker-data-engineering-notes/
├── README.md                   # Main introduction (this file)
├── concepts/                   # Core concepts and theory
│   ├── docker_basics.md
│   ├── podman_basics.md
│   ├── docker_vs_podman.md
│   └── cheatsheet.md
├── dockerfiles/                # Example Dockerfiles
│   ├── Dockerfile_Guide.md
│   ├── python_app/
│   ├── postgres/
│   └── airflow/
├── compose-examples/           # Docker Compose examples
│   ├── DockerCompose_Guide.md
│   ├── postgres.yml
│   ├── airflow.yml
│   └── kafka_spark.yml
├── mini-projects/              # Hands-on projects for practice
│   ├── 01_docker_postgres/
│   ├── 02_docker_airflow/
│   └── 03_kafka_pipeline/
└── notes/                      # Extra notes and troubleshooting
    ├── networking.md
    ├── volumes.md
    ├── security.md
    └── troubleshooting.md

🎯 Goals
  • Learn Docker basics (images, containers, volumes, networks).
  • Compare Docker vs Podman and understand when to use each.
  • Practice data engineering workflows using Docker:
    • PostgreSQL in containers
    • Apache Airflow for ETL pipelines
    • Kafka + Spark for streaming
  • Build a personal cheatsheet for quick reference.
  • Document troubleshooting tips I encounter along the way.
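As a rough illustration of how the Docker basics listed above (images, containers, volumes, networks) fit together with the PostgreSQL goal, here is a minimal sketch. The names `de-net`, `pgdata`, and `pg-demo`, the `example` password, and the `postgres:16` tag are all hypothetical choices for this example and are not taken from the repo. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: PostgreSQL in a container with a named volume and a user-defined network.
# All resource names and the password below are hypothetical placeholders.

ENGINE="${ENGINE:-docker}"    # set ENGINE=podman to try the same steps with Podman
DRY_RUN="${DRY_RUN:-1}"       # 1 = print commands only, 0 = actually run them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

start_postgres() {
  # Network: lets other containers reach the database by container name.
  run "$ENGINE" network create de-net
  # Volume: keeps the database files when the container is removed.
  run "$ENGINE" volume create pgdata
  # Container: started from the official postgres image.
  run "$ENGINE" run -d --name pg-demo \
    --network de-net \
    -e POSTGRES_PASSWORD=example \
    -v pgdata:/var/lib/postgresql/data \
    -p 5432:5432 \
    docker.io/library/postgres:16
}

start_postgres
```

Running it with `DRY_RUN=0` executes the commands for real, and `ENGINE=podman` swaps the engine without changing anything else.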

🛠️ Tools Covered
  • Docker: Core containerization tool.
  • Podman: Docker-compatible alternative that runs without a daemon and supports rootless containers.
  • Docker Compose: Multi-container setup.
  • Data Engineering Tools in Containers:
    • PostgreSQL
    • Apache Airflow
    • Apache Kafka
    • Apache Spark
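Because Podman accepts most of the same command-line syntax as Docker, the examples can often be run with either tool. The sketch below picks whichever engine is installed. The helper name `pick_engine` is hypothetical and exists only for this illustration.

```shell
#!/bin/sh
# Sketch: use whichever container engine is available on PATH.
# pick_engine is a hypothetical helper, not part of Docker or Podman.

# Echo the first candidate that exists as a command; fail if none do.
pick_engine() {
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}

if ENGINE="$(pick_engine docker podman)"; then
  echo "Using $ENGINE"
  "$ENGINE" --version
else
  echo "Neither docker nor podman found on PATH" >&2
fi
```

The order of the arguments sets the preference, so `pick_engine podman docker` would favor Podman when both are installed.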

📖 How to Use
  1. Clone this repo:
    git clone https://github.com/<your-username>/docker-data-engineering-notes.git
    cd docker-data-engineering-notes
    
  2. Explore notes in the concepts/ and notes/ folders.
  3. Run examples from dockerfiles/ or compose-examples/.
  4. Try the mini-projects/ for hands-on practice.
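For step 3, a Compose example could be started roughly as sketched below. This assumes Docker with the Compose plugin (`docker compose`) and that the command is run from the repo root. The path `compose-examples/postgres.yml` comes from the structure listing above, but its contents are not shown here, so treat this as an outline. The helper name `run_stack` is hypothetical.

```shell
#!/bin/sh
# Sketch: start a Compose example and list its services.
# run_stack is a hypothetical helper; the file path is taken from the repo layout.

run_stack() {
  stack_file="$1"
  if [ ! -f "$stack_file" ]; then
    echo "Compose file not found: $stack_file (run this from the repo root)" >&2
    return 1
  fi
  # Start the services in the background, then show their status.
  docker compose -f "$stack_file" up -d
  docker compose -f "$stack_file" ps
}

if ! run_stack compose-examples/postgres.yml; then
  echo "Stack not started" >&2
fi
```

When finished, `docker compose -f compose-examples/postgres.yml down` stops and removes the containers.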

✅ Status

  • Set up repo structure
  • Add Docker basics notes
  • Add Podman basics notes
  • Create PostgreSQL mini-project
  • Add Airflow mini-project
  • Add Kafka + Spark pipeline

📌 References


✍️ This repo is a living document. I'll keep updating it as I learn more about Docker, Podman, and data engineering workflows.
