Libraries.io Releases Data on Over 25m Open Source Software Repositories ◆ by Benjamin Nickolls ◆ Libraries.io ◆ Medium
Jun 15, 2017
Today’s software relies on a core set of free, openly licensed components, frameworks and systems. But our shared digital infrastructure is under threat. It’s overburdened and under-supported.
Nadia Eghbal’s Roads and Bridges study for the Ford Foundation gave us a series of personal vignettes on the state of open source — stressed maintainers, fractured communities and financial trouble. The stories we read resonated with our own experiences, our concerns legitimised and amplified.
Something has to change.
For nearly three years, Libraries.io has been gathering data on the complex web of interdependency that exists in open source software. We’ve published a series of experiments using harvested metadata to highlight projects in need of assistance, projects with too few contributors and too little attention.
Many of these are projects that are essential to making much of today’s software work. And these are projects that need our help. This problem is already attracting the attention of our community, as shown by this month’s Sustain conference in San Francisco.
Today Libraries.io is releasing data on over two million unique infrastructure-type projects under a permissive licence. We believe that this information will empower and accelerate the work of those seeking solutions, and kickstart the conversation about what will lead to a sustainable economy for key open source projects.
The data is available in its raw format on Zenodo, and soon we’ll be publishing a structured, queryable dataset on Google’s BigQuery. This data is published under a Creative Commons BY-SA 4.0 licence. It’s an open and permissive licence that commits the user to re-distributing their work, and their understanding.
We’ll be updating this data on a regular basis. Find out more at https://libraries.io/data.
As a community, we’re yet to agree a way to ensure the continuing health of individual open source projects. But we hope this will be another step in the right direction.
Product guy at @octobox. Formerly @tidelift via @librariesio and @dependencyci. Part time game designer and co founder of @atpcardgame.