ML-CryptoSI: A Multilingual Crypto Sentiment Index and its Role in Bitcoin and Ethereum Pricing

Authors

  • Ningyu Zhou, Al-Farabi Kazakh National University

DOI:

https://doi.org/10.55927/jfbd.v5i1.6

Keywords:

Cryptocurrency, Sentiment Analysis, Multilingual Text, Bitcoin, Ethereum

Abstract

Cryptocurrency prices often move with narratives and investor sentiment. This paper builds a multilingual crypto sentiment index, ML-CryptoSI, using daily news text in six languages and Binance market data for BTC and ETH. We first aggregate language-level daily sentiment and then use PCA to extract the common component across languages. Next, we test whether ML-CryptoSI predicts next-day returns and volatility proxies after controlling for lagged market conditions, liquidity, and day-of-week fixed effects. The results show that ML-CryptoSI has incremental information for returns, especially for ETH, and the effect is stronger on high news-intensity days. In contrast, the evidence for volatility prediction is weak in this short sample. Overall, the findings suggest that the common factor in multilingual news sentiment matters for short-run crypto pricing and is state dependent.
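
The two-step procedure described in the abstract (extract the common component of language-level sentiment with PCA, then run a next-day predictive regression with controls) can be illustrated with a minimal sketch. This is not the paper's implementation: the data are synthetic, the function names `first_principal_component` and `predictive_ols` are hypothetical, the sign convention for the component is an assumption, and the only controls shown are the lagged return and day-of-week dummies (the liquidity control mentioned in the abstract is omitted).

```python
import numpy as np

def first_principal_component(X):
    """First principal component of a (days x languages) sentiment matrix.

    Columns are standardized first so no single language dominates by scale.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # SVD of the standardized data; rows of Vt are the principal axes.
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    w = Vt[0]
    # Assumed sign convention: a higher index means more positive sentiment.
    if w.sum() < 0:
        w = -w
    scores = Z @ w
    explained = S[0] ** 2 / np.sum(S ** 2)
    return scores, w, explained

def predictive_ols(y_next, X):
    """OLS of y_{t+1} on an intercept and the columns of X_t."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y_next, rcond=None)
    return beta

# Synthetic stand-in for daily sentiment in six languages (not real data).
rng = np.random.default_rng(0)
T, n_lang = 300, 6
common = rng.normal(size=T)
sent = 0.8 * common[:, None] + 0.6 * rng.normal(size=(T, n_lang))

index, loadings, share = first_principal_component(sent)

# Synthetic returns in which the next-day return loads on today's common factor.
ret = np.zeros(T)
ret[1:] = 0.5 * common[:-1] + rng.normal(size=T - 1)

# Day-of-week dummies (one category dropped because of the intercept).
dow = np.arange(T) % 7
D = (dow[:, None] == np.arange(1, 7)).astype(float)

# Regress r_{t+1} on index_t, r_t, and the dummies for day t+1.
X = np.column_stack([index[:-1], ret[:-1], D[1:]])
beta = predictive_ols(ret[1:], X)

print("variance share of first component:", round(float(share), 3))
print("coefficient on sentiment index:", round(float(beta[1]), 3))
```

In this sketch `beta[1]` is the coefficient on the sentiment index. The state dependence reported in the abstract could be explored by interacting the index with a high news-intensity indicator, which is left out here.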

Published

2026-03-31
