ohai.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
A cozy, fast and secure Mastodon server where everyone is welcome. Run by the folks at ohai.is.

#datasets

6 posts · 6 participants · 0 posts today
Winbuzzer
Wikipedia and Kaggle Release Structured Dataset to Aid AI Development, Counter Scraping

#AI #AITraining #Wikipedia #Kaggle #AIData #MachineLearning #OpenData #Wikimedia #Datasets #BigData #DataScience #LLMs #NLP #Google #Alphabet

https://winbuzzer.com/2025/04/17/wikipedia-and-kaggle-release-structured-dataset-to-aid-ai-development-counter-scraping-xcxwbn/

Marek Pavliš 🇨🇿 🇪🇺
"...there is no #AI without #energy; at the same time, AI has the potential to transform the energy sector."

📊 This "Energy and AI" #report from the International Energy Agency (#IEA) is based on new global and regional modelling and #datasets, as well as extensive consultation with governments and regulators, the #tech sector, the energy industry and international experts.

👉 https://www.iea.org/reports/energy-and-ai

#IT #data #electricity #artificiallife

The OpenAIRE Graph
Unlock #research insights with the new #OpenAIREGraph #API!

Easily discover #publications, #datasets, & #software across #OpenScience infrastructures.
- Search with precision using linked #metadata
- Find #OpenAccess versions & related datasets
- Trace research back to funders & institutions

Start exploring today: https://graph.openaire.eu/docs/apis/graph-api/

#ResearchDiscovery #OpenInfrastructures #OpenData

Audio Developer Conference
Scalable, Efficient Processing and Analysis of Large Audio Datasets – Pawel Cyrta – ADCx Gather 2024
https://www.youtube.com/watch?v=lHME1l9cEPk
#coding #Datasets #programming #softwareengineering

Slimy9343
What #databrokers are out there and where to find the #datasets to analyze?

For those wondering what I intend to do: I want to understand how such datasets are structured and what kind of #personaldata a specific data broker has collected about its customers.

#privacy #datascience #education #cybersecurity

From the Data Rescue Project: the Data Rescue Tracker. “The Data Rescue Tracker is a collaborative tool built to catalog existing public data rescue efforts so that we can coordinate better across initiatives. At this stage, you can use the tool to help reduce duplication of rescue efforts. The Data Rescue Tracker aims to provide a consolidated overview of who is backing up which dataset from […]

https://rbfirehose.com/2025/04/13/the-data-rescue-tracker/

ResearchBuzz: Firehose | Individual posts from ResearchBuzz · The Data Rescue Tracker | ResearchBuzz: Firehose

"Almost two dozen repositories of research and public health data supported by the National Institutes of Health are marked for “review” under the Trump administration’s direction, and researchers and archivists say the data is at risk of being lost forever if the repositories go down.

“The problem with archiving this data is that we can’t,” Lisa Chinn, Head of Research Data Services at the University of Chicago, told 404 Media. Unlike other government datasets or web pages, downloading or otherwise archiving NIH data often requires a Data Use Agreement between a researcher institution and the agency, and those agreements are carefully administered through a disclosure risk review process.

A message appeared at the top of multiple NIH websites last week that says: “This repository is under review for potential modification in compliance with Administration directives.”
Repositories with the message include archives of cancer imagery, Alzheimer’s disease research, sleep studies, HIV databases, and COVID-19 vaccination and mortality data."

404media.co/nih-archives-repos

404 Media · Massive, Unarchivable Datasets of Cancer, Covid, and Alzheimer's Research Could Be Lost Forever
Days before Robert F. Kennedy Jr. announced that 10,000 HHS staffers would lose their jobs, a message appeared on NIH research repository sites saying they were "under review."
#USA #Trump #Datasets

Axios: NOAA research websites slated to go dark get a reprieve. “NOAA has averted the early cancellation of an Amazon Web Services contract that would have caused a slew of agency websites to go dark beginning at midnight, the agency said Friday. Why it matters: The outages mainly would have affected NOAA’s research division, and would have made numerous websites and data sets inaccessible to […]

https://rbfirehose.com/2025/04/06/axios-noaa-research-websites-slated-to-go-dark-get-a-reprieve/

Massive, Unarchivable #Datasets of #Cancer, #Covid, #HIV and #Alzheimer's Research Could Be Lost Forever
Days before RFK announced 10,000 #HHS staffers would lose their jobs, a message appeared on #NIH research repository sites saying they were "under review." Unlike other government datasets or web pages, downloading or otherwise archiving NIH data often requires a Data Use Agreement between a researcher institution and the agency.
404media.co/nih-archives-repos
archive.ph/Y8asq

#ListenBrainz / #MetaBrainz I'm confused. Aren't sponsors the true customer? Why use this? 🤔

On one hand #Music: "Listen together", "Ethical forever"

On the other: #DATASETS

"Some of the world’s biggest platforms such as Google and Amazon, use our data"

"We ask commercial supporters to support us in order to help fund the creation and maintenance of these datasets."

"The following organizations make use of the data-sets published by MetaBrainz"

"Unicorn tier: #Google, #Amazon, #Spotify"

STAT: Gold-standard maternal mortality database in limbo as CDC staff placed on leave. “As part of the sweeping layoffs that rocked the Department of Health and Human Services on Tuesday, the entire staff that oversaw an annual survey to better understand infant and maternal health — and that was considered the gold standard in the field — was placed on administrative leave. The Pregnancy […]

https://rbfirehose.com/2025/04/02/stat-gold-standard-maternal-mortality-database-in-limbo-as-cdc-staff-placed-on-leave/

Clemson News: Study: Researchers’ choices could result in different conclusions from the same data. “If you give hundreds of researchers the same data and the same hypotheses to test, they will reach the same conclusions, right? Wrong, according to a recent study published in the journal BMC Biology. Two hundred forty-six researchers in the fields of ecology and evolutionary biology — […]

https://rbfirehose.com/2025/04/01/study-researchers-choices-could-result-in-different-conclusions-from-the-same-data-clemson-news/

ResearchBuzz: Firehose | Individual posts from ResearchBuzz · Study: Researchers’ choices could result in different conclusions from the same data (Clemson News) | ResearchBuzz: Firehose

arXiv: FediverseSharing: A Novel Dataset on Cross-Platform Interaction Dynamics between Threads and Mastodon Users. “In March 2024, Threads joined this federation by introducing its Fediverse Sharing service, which enables interactions such as posts, replies, and likes between Threads and Mastodon users as if on a unified platform. Building on this development, we introduce FediverseSharing, […]

https://rbfirehose.com/2025/02/27/fediversesharing-a-novel-dataset-on-cross-platform-interaction-dynamics-between-threads-and-mastodon-users-arxiv/

ResearchBuzz: Firehose | Individual posts from ResearchBuzz · FediverseSharing: A Novel Dataset on Cross-Platform Interaction Dynamics between Threads and Mastodon Users (arXiv) | ResearchBuzz: Firehose

From handling massive #DataSets to streamlining delivery, UC Berkeley #Library is ensuring that #ResearchData is well-managed, accessible, and compliant with licensing agreements through #Dataverse, so resources are discoverable and usable by the entire university community. #RDM #DataManagement youtu.be/XVBUna3wzgk?si=c_Ixa-