Debian Conference 2025, Brest, July 14-19.

Debian in the Research Software Ecosystem:

A Bibliometric Analysis

 
Joenio M. Costa and Christina von Flach
Institute of Computing, Federal University of Bahia

Joenio Marques da Costa

http://joenio.me

Institute of Computing, Federal University of Bahia

  • PhD candidate in Software Engineering
  • Works at the Cortext Platform (www.cortext.net)
  • Debian Developer

Research Software

X

Software in Research

Multi-Dimensional Research Software Categorization

Bibliometric Analysis

Research Strategy

The data were retrieved from the Scopus database on June 5, 2025.

  • Query string “TITLE-ABS-KEY (debian)”
  • A total of 473 results were found.

RQ 1 - Publication count per year

What is the annual number of publications?

“Including Diagnostic Information in Configuration Models (2000), Tommi Syrjänen”

“As an example, a subset of the configuration problem for the Debian GNU/Linux system is formalized using the new rule-based language.”
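A minimal sketch of how the yearly count could be computed from a CSV export of the query results. It assumes the export has a "Year" column; the sample rows are hypothetical and are not data from this study.

```python
import csv
import io
from collections import Counter


def publications_per_year(csv_text):
    """Count records per publication year in a CSV export.

    Assumes a "Year" column; rows with an empty year are skipped.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(int(row["Year"]) for row in reader if row.get("Year"))
    # Return the counts in chronological order.
    return dict(sorted(counts.items()))


# Hypothetical three-record sample, not data from the study.
SAMPLE = """Title,Year
Paper A,2000
Paper B,2010
Paper C,2010
"""

print(publications_per_year(SAMPLE))
```

With a real export, the file contents would be read from disk and passed in place of SAMPLE.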

RQ 2 - Top-cited papers

Which papers have been cited the most by other papers?

“Meep: A flexible free-software package for electromagnetic simulations by the FDTD method (2010), Ardavan F. Oskooi”

“Operating system: Any Unix-like system; developed under Debian GNU/Linux 5.0.2.”
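One way to rank papers by citations, sketched under the assumption that each record carries a "Cited by" field holding a count as text (empty when a paper has no citations). The records below are hypothetical placeholders.

```python
def top_cited(records, n=3):
    """Return the n records with the highest citation count.

    Assumes a "Cited by" field; a missing or empty value counts as 0.
    """
    def citations(rec):
        value = (rec.get("Cited by") or "").strip()
        return int(value) if value else 0

    return sorted(records, key=citations, reverse=True)[:n]


# Hypothetical records, not data from the study.
records = [
    {"Title": "Paper A", "Cited by": "12"},
    {"Title": "Paper B", "Cited by": ""},
    {"Title": "Paper C", "Cited by": "40"},
]

for rec in top_cited(records, n=2):
    print(rec["Title"], rec["Cited by"])
```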

RQ 3 - Most active researchers

Who are the most active researchers, measured by number of published papers?

  • Zacchiroli S. (13 papers), German D.M. (10 papers), Di Cosmo R. (9 papers) and Robles G. (9 papers).

“The Ultimate Debian Database: Consolidating bazaar metadata for Quality Assurance and data mining (2010), Zacchiroli S.”

“A Model to Understand the Building and Running Inter-Dependencies of Software (2007), German D.M.”

RQ 4 - Active countries

Which countries are contributing the most, based on the affiliations of the researchers?

  • USA (28 papers), France (20 papers), Canada (19 papers) and Germany (17 papers).
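A possible way to derive country counts from affiliations, shown as a heuristic sketch. It assumes an "Affiliations" field where affiliations are separated by semicolons and the country is the last comma-separated part of each one; real affiliation strings are messier and would need cleaning. The sample records are hypothetical.

```python
from collections import Counter


def papers_per_country(records):
    """Count records per country, counting each country once per record.

    Heuristic: the country is the last comma-separated part of each
    semicolon-separated affiliation (an assumption about the data).
    """
    counts = Counter()
    for rec in records:
        countries = []
        for affiliation in rec.get("Affiliations", "").split(";"):
            parts = [p.strip() for p in affiliation.split(",") if p.strip()]
            if parts and parts[-1] not in countries:
                countries.append(parts[-1])
        counts.update(countries)
    return counts


# Hypothetical records, not data from the study.
records = [
    {"Affiliations": "University X, City, France; Lab Y, Town, Canada"},
    {"Affiliations": "University Z, City, France; University X, City, France"},
]

print(papers_per_country(records).most_common())
```

Counting each country once per paper means that a paper with several authors from the same country does not inflate that country's total.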

RQ 5 - Top venues

Which venues (i.e., conferences, journals) are the main targets of papers?

“Lecture Notes in Computer Science (LNCS) (16 papers)”

RQ 6 - Top-relevant terms and frequent words

What are the most relevant terms and concepts in the field?

  • Most popular terms: “open source”, “operating system”, “linux” and “open systems”.
  • The term “free software” shows little relevance.

Future Work

  • Collect data from other databases besides Scopus
  • Analyze collaboration networks across institutions and countries
  • Add other Linux distributions
    • Ubuntu, Linux Mint, Fedora Linux, Red Hat Enterprise Linux (RHEL), openSUSE, SUSE Linux Enterprise, Arch Linux, Manjaro, Gentoo, Alpine Linux.
  • Cross-reference bibliometric data with source code metrics
  • Cross-reference bibliometric data with source package metrics

Thanks!

joenio@joenio.me


This presentation is available at:

http://joenio.me/debconf2025-academictrack-talk

(source code: https://gitlab.com/joenio/joenio.gitlab.io)

License Creative Commons
