This repository hosts the algorithms used to train and validate the BERT models associated with the work 'Modelo para classificação do viés político de postagens de usuários em redes sociais' (Model for classifying the political bias of user posts on social networks), submitted as a final undergraduate project (Trabalho de Conclusão de Curso) at the Universidade Federal de Ouro Preto (UFOP).
The work was carried out with the following technologies:
- Python
- Jupyter Notebook
- Pandas
- Scikit-learn
- BERT
- NeuralMind BERT
- Hugging Face Transformers
- Google Colab