Social Sciences and Humanities on Big Data: a Bibliometric Analysis

Gastón Becerra, Cristian Ratovicius


Purpose: This paper provides a comprehensive bibliometric review of the social science, psychology, and humanities literature on big data. Methods: Production and authorship trends, topics and areas, and citations were analyzed through a bibliometric analysis of a corpus of 5,500 Scopus articles published from 2010 to 2020. Findings: The analysis revealed similarities and differences between the social science, psychology, and humanities literature and the general big data literature in terms of publication, framing, and referencing trends. Both fields show a steady increase in output, although the growth rate has slowed since 2015. Production in both the specific and the general field is led by just a few countries, with the USA and China at the top of the ranking. Single authorship has been decreasing in both fields. What distinguishes the framing of big data in the social sciences and humanities is a critical view that goes beyond ethical considerations to include the social construction of datasets, the political and ideological uses of big data, and the discussion of its philosophical and epistemological foundations. Value: To the best of our knowledge, this is the first study to provide a comprehensive view of the bibliometrics of big data in the social sciences and humanities while providing context against which to compare results.


Keywords: big data, social sciences, humanities, bibliometric analysis, citation analysis





