#datatalksclub

emmuzoo<p>📊 Final product: 3 dashboards in Looker Studio with key insights on SF bike usage in 2023–2024.</p><p>From messy CSVs to visual stories — loving this data journey 🚴‍♀️</p><p><a href="https://mastodon.social/tags/dezoomcamp" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dezoomcamp</span></a> <a href="https://mastodon.social/tags/DataTalksClub" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataTalksClub</span></a> <a href="https://mastodon.social/tags/dataengineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dataengineering</span></a></p>
emmuzoo<p>🔍 Project Goals:<br>• Avg trip time &amp; distance<br>• Most common bike type<br>• Most active user type<br>• Peak ride hours<br>• Most popular stations</p><p>Happy to say: mission accomplished ✅</p><p><a href="https://mastodon.social/tags/dezoomcamp" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dezoomcamp</span></a> <a href="https://mastodon.social/tags/DataTalksClub" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataTalksClub</span></a> <a href="https://mastodon.social/tags/dataengineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dataengineering</span></a></p>
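The goals listed in the post above map onto a handful of simple aggregations. Here is a minimal plain-Python sketch of them, not the dbt/BigQuery models the project actually used. The records and field names (`started_at`, `bike_type`, `user_type`, `start_station`) are hypothetical, and average distance is left out because it would need station coordinates.

```python
from collections import Counter
from datetime import datetime
from statistics import mean

# Hypothetical trip records; the field names are assumptions, not the real schema.
trips = [
    {"started_at": "2024-01-05 08:10:00", "ended_at": "2024-01-05 08:25:00",
     "bike_type": "electric_bike", "user_type": "member", "start_station": "Market St"},
    {"started_at": "2024-01-05 08:40:00", "ended_at": "2024-01-05 08:50:00",
     "bike_type": "classic_bike", "user_type": "casual", "start_station": "Ferry Building"},
    {"started_at": "2024-01-05 17:05:00", "ended_at": "2024-01-05 17:35:00",
     "bike_type": "electric_bike", "user_type": "member", "start_station": "Market St"},
]

FMT = "%Y-%m-%d %H:%M:%S"


def trip_minutes(trip):
    """Trip duration in minutes, from the start and end timestamps."""
    start = datetime.strptime(trip["started_at"], FMT)
    end = datetime.strptime(trip["ended_at"], FMT)
    return (end - start).total_seconds() / 60


def most_common(values):
    """The single most frequent value in an iterable."""
    return Counter(values).most_common(1)[0][0]


summary = {
    "avg_trip_minutes": mean(trip_minutes(t) for t in trips),
    "top_bike_type": most_common(t["bike_type"] for t in trips),
    "top_user_type": most_common(t["user_type"] for t in trips),
    "peak_hour": most_common(
        datetime.strptime(t["started_at"], FMT).hour for t in trips
    ),
    "top_station": most_common(t["start_station"] for t in trips),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

In a warehouse, each of these would typically be a `GROUP BY` query or a dbt model rather than an in-memory loop.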
emmuzoo<p>📦 Raw bike trip data and Bay Area counties were loaded into GCS, transformed with dbt, and stored in BigQuery.</p><p>Every piece automated with Kestra flows and IaC with Terraform 💪</p><p><a href="https://mastodon.social/tags/dezoomcamp" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dezoomcamp</span></a> <a href="https://mastodon.social/tags/DataTalksClub" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataTalksClub</span></a> <a href="https://mastodon.social/tags/dataengineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dataengineering</span></a></p>
emmuzoo<p>⚡ E-bikes are on the rise in SF!</p><p>This project revealed fascinating insights about how different users move around the city on bikes.</p><p>Infrastructure: Terraform<br>Orchestration: Kestra<br>Transformations: dbt<br>Warehouse: BigQuery</p><p><a href="https://mastodon.social/tags/dezoomcamp" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dezoomcamp</span></a> <a href="https://mastodon.social/tags/DataTalksClub" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataTalksClub</span></a> <a href="https://mastodon.social/tags/dataengineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dataengineering</span></a></p>
emmuzoo<p>👀 Curious when people ride shared bikes most often?</p><p>I used Bay Wheels data to analyze hourly &amp; weekly usage trends and visualized them with Looker Studio.</p><p>GCP + dbt + Kestra = smooth orchestration 💡</p><p><a href="https://mastodon.social/tags/dezoomcamp" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dezoomcamp</span></a> <a href="https://mastodon.social/tags/DataTalksClub" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataTalksClub</span></a> <a href="https://mastodon.social/tags/dataengineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dataengineering</span></a></p>
emmuzoo<p>🛠️ From raw CSVs to dashboards! Built a modern data pipeline using GCS, Terraform, Kestra, and dbt to analyze shared bike usage patterns in the SF Bay Area.</p><p>Found answers to key questions like trip duration, user types &amp; popular stations.</p><p><a href="https://mastodon.social/tags/dezoomcamp" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dezoomcamp</span></a> <a href="https://mastodon.social/tags/DataTalksClub" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataTalksClub</span></a> <a href="https://mastodon.social/tags/dataengineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dataengineering</span></a></p>
emmuzoo<p>🚴‍♂️ Just wrapped up a data engineering project analyzing Bay Wheels bike trips in the San Francisco Bay Area using real data from 2023-2024.</p><p>Used Terraform, Kestra, dbt, BigQuery &amp; Looker Studio to build a full batch data pipeline.</p><p><a href="https://mastodon.social/tags/dezoomcamp" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dezoomcamp</span></a> <a href="https://mastodon.social/tags/DataTalksClub" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataTalksClub</span></a> <a href="https://mastodon.social/tags/dataengineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dataengineering</span></a></p>
emmuzoo<p>Mission complete! ✅ Just finished the <a href="https://mastodon.social/tags/DataEngineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataEngineering</span></a> homework: identifying the longest uninterrupted streak of taxi rides in a 5-minute window using <a href="https://mastodon.social/tags/Kafka" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Kafka</span></a> and <a href="https://mastodon.social/tags/Flink" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Flink</span></a>. Feeling proud of the progress so far! 🚖</p><p><a href="https://mastodon.social/tags/DataTalksClub" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataTalksClub</span></a> <a href="https://mastodon.social/tags/Streaming" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Streaming</span></a> <a href="https://mastodon.social/tags/Kafka" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Kafka</span></a> <a href="https://mastodon.social/tags/Flink" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Flink</span></a> <a href="https://mastodon.social/tags/PyFlink" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>PyFlink</span></a> <a href="https://mastodon.social/tags/DataScience" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataScience</span></a> <a href="https://mastodon.social/tags/TaxiNY" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TaxiNY</span></a></p>
emmuzoo<p>Wrapping up Module 6 of <a href="https://mastodon.social/tags/DataEngineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataEngineering</span></a> and diving into the homework! 📝 The task: find the longest uninterrupted streak of taxi rides in a 5-minute window using <a href="https://mastodon.social/tags/Kafka" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Kafka</span></a> and <a href="https://mastodon.social/tags/Flink" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Flink</span></a>. Challenge accepted! 🚖💨</p><p><a href="https://mastodon.social/tags/DataTalksClub" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataTalksClub</span></a> <a href="https://mastodon.social/tags/Streaming" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Streaming</span></a> <a href="https://mastodon.social/tags/Kafka" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Kafka</span></a> <a href="https://mastodon.social/tags/Flink" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Flink</span></a> <a href="https://mastodon.social/tags/PyFlink" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>PyFlink</span></a> <a href="https://mastodon.social/tags/DataScience" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataScience</span></a></p>
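One way to read the task in the post above is: group rides into sessions where each ride starts no more than five minutes after the previous one, then report the largest session. The sketch below does that in plain Python over a sorted list of timestamps. It is not PyFlink code, and the interpretation of "uninterrupted streak" and the sample timestamps are assumptions; in a streaming job this grouping is roughly what a session window with a five-minute gap would produce.

```python
from datetime import datetime, timedelta


def longest_streak(pickup_times, max_gap=timedelta(minutes=5)):
    """Size of the longest run of rides whose consecutive gaps are <= max_gap."""
    best = 0
    current = 0
    previous = None
    for ts in sorted(pickup_times):
        if previous is not None and ts - previous <= max_gap:
            current += 1  # still inside the same session
        else:
            current = 1   # gap too large (or first ride): start a new session
        best = max(best, current)
        previous = ts
    return best


# Hypothetical pickup times, given as minute offsets from an arbitrary base.
base = datetime(2019, 10, 1, 12, 0)
rides = [base + timedelta(minutes=m) for m in (0, 3, 7, 20, 22, 24, 26, 40)]

print(longest_streak(rides))  # 4: the rides at minutes 20, 22, 24 and 26
```

A real streaming version would also need to key the stream (for example by pickup and drop-off location) and handle late events with watermarks, which this batch sketch ignores.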
emmuzoo<p>Halfway through Module 6 of <a href="https://mastodon.social/tags/DataEngineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataEngineering</span></a>! 🖥️ Learning how to process real-time data streams with <a href="https://mastodon.social/tags/Kafka" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Kafka</span></a> and <a href="https://mastodon.social/tags/Flink" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Flink</span></a>. I’m now applying what I've learned to the Taxi NY Green dataset. Excited for what comes next! 🚖✨</p><p><a href="https://mastodon.social/tags/Streaming" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Streaming</span></a> <a href="https://mastodon.social/tags/DataTalksClub" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataTalksClub</span></a> <a href="https://mastodon.social/tags/Kafka" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Kafka</span></a> <a href="https://mastodon.social/tags/Flink" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Flink</span></a> <a href="https://mastodon.social/tags/PyFlink" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>PyFlink</span></a> <a href="https://mastodon.social/tags/DataScience" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataScience</span></a></p>
emmuzoo<p>Just kicked off Module 6 of the <a href="https://mastodon.social/tags/DataEngineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataEngineering</span></a> Zoomcamp by @DataTalksClub! 🎉 It's all about <a href="https://mastodon.social/tags/StreamProcessing" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>StreamProcessing</span></a> with <a href="https://mastodon.social/tags/Kafka" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Kafka</span></a>, <a href="https://mastodon.social/tags/Flink" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Flink</span></a>, and <a href="https://mastodon.social/tags/PyFlink" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>PyFlink</span></a>. Can't wait to get hands-on with real-time data streaming! 🖥️🚀</p><p><a href="https://mastodon.social/tags/Streaming" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Streaming</span></a> <a href="https://mastodon.social/tags/DataTalksClub" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataTalksClub</span></a> <a href="https://mastodon.social/tags/Kafka" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Kafka</span></a> <a href="https://mastodon.social/tags/Flink" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Flink</span></a> <a href="https://mastodon.social/tags/PyFlink" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>PyFlink</span></a></p>
emmuzoo<p>🎉 Final Step: Successfully Built Analytical Views!<br>The project is complete! After transforming the data, I’ve created models that serve analytical views for various queries in BigQuery. The combination of dbt and BigQuery makes data engineering a smooth ride. Grateful for all the learning in this module! <a href="https://mastodon.social/tags/dbt" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dbt</span></a> <a href="https://mastodon.social/tags/DataTalksClub" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataTalksClub</span></a> <a href="https://mastodon.social/tags/DataEngineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataEngineering</span></a> <a href="https://mastodon.social/tags/GCP" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GCP</span></a> <a href="https://mastodon.social/tags/Analytics" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Analytics</span></a></p>
emmuzoo<p>🔧 Optimizing Data with dbt Models<br>I’ve been creating dbt models for multiple queries across the Green Taxi, Yellow Taxi, and FHV datasets in BigQuery. From source tables to final reports, it's amazing to see how dbt handles dependencies, testing, and version control. Ready to run the first models! <a href="https://mastodon.social/tags/DataEngineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataEngineering</span></a> <a href="https://mastodon.social/tags/dbt" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dbt</span></a> <a href="https://mastodon.social/tags/GCP" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GCP</span></a> <a href="https://mastodon.social/tags/DataTalksClub" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataTalksClub</span></a></p>
emmuzoo<p>📊 Exploring dbt for Data Transformation<br>The journey continues! In this part of the project, I'm learning how dbt models help automate data transformation. I'm building out models in dbt for these taxi datasets to create clean, analysis-ready data in <a href="https://mastodon.social/tags/BigQuery" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>BigQuery</span></a>. It’s fascinating to see how everything connects! <a href="https://mastodon.social/tags/DataEngineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataEngineering</span></a> <a href="https://mastodon.social/tags/dbt" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dbt</span></a> <a href="https://mastodon.social/tags/GCP" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GCP</span></a> <a href="https://mastodon.social/tags/ETL" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ETL</span></a> <a href="https://mastodon.social/tags/DataTalksClub" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataTalksClub</span></a></p>
emmuzoo<p>🚀 Started Module 4 of <a href="https://mastodon.social/tags/DataEngineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataEngineering</span></a> Zoomcamp!<br>Just kicked off the Analytics Engineering module and I'm diving into transforming the Green Taxi, Yellow Taxi, and FHV NY Taxi datasets loaded in <a href="https://mastodon.social/tags/BigQuery" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>BigQuery</span></a>. Excited to see how dbt can help create analytical views for better decision-making! <a href="https://mastodon.social/tags/dbt" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dbt</span></a> <a href="https://mastodon.social/tags/DataTalksClub" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataTalksClub</span></a> <a href="https://mastodon.social/tags/GCP" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GCP</span></a> <a href="https://mastodon.social/tags/AnalyticsEngineering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AnalyticsEngineering</span></a> <a href="https://mastodon.social/tags/ETL" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ETL</span></a></p>
vanalex<p>This week at the Data Engineering Zoomcamp 2025 by <a href="https://mastodon.social/tags/DataTalksClub" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataTalksClub</span></a>, we're diving into data warehouses and BigQuery.</p><p>Special mention to Michael Shoemaker for the insightful lessons, and to Alexey Grigorev for organizing the sessions.</p><p>Let's continue this learning journey together.</p>
emmuzoo<p>🔚 Final Results &amp; Lessons Learned<br>🏆 4th (Public LB) – RMSE: 12.2324<br>🏅 5th (Private LB) – RMSE: 9.5624<br>Key takeaways:<br>✔ Feature engineering &amp; selection are crucial<br>✔ Encoding strategies impact model performance<br>✔ Hyperparameter tuning makes a real difference! 🚀</p><p><a href="https://mastodon.social/tags/DataTalksClub" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataTalksClub</span></a> <a href="https://mastodon.social/tags/zoomcamp" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>zoomcamp</span></a> <a href="https://mastodon.social/tags/machinelearning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>machinelearning</span></a></p>
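For readers unfamiliar with the metric behind the leaderboard scores above: RMSE is the square root of the mean squared difference between predictions and true values, so lower is better. A minimal sketch with made-up numbers (not the competition data):

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy values: the errors are 1, 0 and -2, so the mean squared error is 5/3.
score = rmse([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
print(round(score, 4))  # 1.291
```

Because the errors are squared before averaging, a few large misses raise RMSE more than many small ones.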
emmuzoo<p>🔧 Hyperparameter Optimization<br>Tuned XGBoost using Optuna, a powerful Bayesian optimization library. Finding the best hyperparameters helped lower RMSE and improve generalization! 🔥⚡<br><a href="https://mastodon.social/tags/DataTalksClub" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataTalksClub</span></a> <a href="https://mastodon.social/tags/zoomcamp" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>zoomcamp</span></a> <a href="https://mastodon.social/tags/machinelearning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>machinelearning</span></a></p>
emmuzoo<p>⚙️ Model Choice: XGBoost<br>Why?<br>✅ Handles missing data well<br>✅ Great with tabular data<br>✅ Efficient and highly tunable<br>XGBoost was the perfect choice for this structured dataset! 📈💡<br><a href="https://mastodon.social/tags/DataTalksClub" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataTalksClub</span></a> <a href="https://mastodon.social/tags/zoomcamp" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>zoomcamp</span></a> <a href="https://mastodon.social/tags/machinelearning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>machinelearning</span></a></p>
emmuzoo<p>🛠️ Preprocessing<br>For high-cardinality categorical features, I used Target Encoding.<br>For low-cardinality categorical features, I applied Ordinal Encoding.<br>Missing values? Used SimpleImputer (most_frequent) to fill them efficiently. 🚀<br><a href="https://mastodon.social/tags/DataTalksClub" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataTalksClub</span></a> <a href="https://mastodon.social/tags/zoomcamp" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>zoomcamp</span></a> <a href="https://mastodon.social/tags/machinelearning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>machinelearning</span></a></p>
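The three preprocessing steps named in the post above can be illustrated without any library. This is a plain-Python sketch of the ideas, not the scikit-learn calls the post presumably used, and the column names and values are invented. Production target encoding usually adds smoothing and cross-fitting to limit target leakage; both are omitted here.

```python
from collections import Counter, defaultdict


def impute_most_frequent(values):
    """Replace None with the most frequent non-missing value."""
    mode = Counter(v for v in values if v is not None).most_common(1)[0][0]
    return [mode if v is None else v for v in values]


def ordinal_encode(values):
    """Map each category to an integer, using sorted category order."""
    mapping = {cat: i for i, cat in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


def target_encode(values, targets):
    """Replace each category with the mean target seen for that category."""
    sums = defaultdict(float)
    counts = Counter()
    for value, target in zip(values, targets):
        sums[value] += target
        counts[value] += 1
    means = {cat: sums[cat] / counts[cat] for cat in counts}
    return [means[v] for v in values], means


# Hypothetical toy columns: "city" stands in for a high-cardinality feature,
# "size" for a low-cardinality one, and "price" for the regression target.
city = ["A", "B", None, "A", "C", "A"]
size = ["small", "large", "small", "medium", "large", "small"]
price = [10.0, 20.0, 12.0, 14.0, 30.0, 12.0]

city_filled = impute_most_frequent(city)
city_encoded, city_means = target_encode(city_filled, price)
size_encoded, size_map = ordinal_encode(size)

print(city_filled)   # ['A', 'B', 'A', 'A', 'C', 'A']
print(city_means)    # {'A': 12.0, 'B': 20.0, 'C': 30.0}
print(size_encoded)  # [2, 0, 2, 1, 0, 2]
```

Note that sorted order puts `large` before `small`; when the categories have a natural ranking, the mapping should be given explicitly instead.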