Digitala Vetenskapliga Arkivet

1 - 45 of 45
  • 1.
    Adewumi, Oluwatosin
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Brännvall, Rickard
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab. RISE Research Institutes of Sweden.
    Abid, Nosheen
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Pahlavan, Maryam
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Sabah Sabry, Sana
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Småprat: DialoGPT for Natural Language Generation of Swedish Dialogue by Transfer Learning. 2022. In: Proceedings of the Northern Lights Deep Learning Workshop 2022 / [ed] Sigurd Løkse, Benjamin Ricaud, Septentrio Academic Publishing, 2022, Vol. 3. Conference paper (Refereed)
    Abstract [en]

    Building open-domain conversational systems (or chatbots) that produce convincing responses is a recognized challenge. Recent state-of-the-art (SoTA) transformer-based models for the generation of natural language dialogue have demonstrated impressive performance in simulating human-like, single-turn conversations in English. This work investigates, by an empirical study, the potential for transfer learning of such models to the Swedish language. DialoGPT, an English language pre-trained model, is adapted by training on three different Swedish language conversational datasets obtained from publicly available sources: Reddit, Familjeliv and the GDC. Perplexity score (an automated intrinsic metric) and surveys by human evaluation were used to assess the performances of the fine-tuned models. We also compare the DialoGPT experiments with an attention-mechanism-based seq2seq baseline model, trained on the GDC dataset. The results indicate that the capacity for transfer learning can be exploited with considerable success. Human evaluators asked to score the simulated dialogues judged over 57% of the chatbot responses to be human-like for the model trained on the largest (Swedish) dataset. The work agrees with the hypothesis that deep monolingual models learn some abstractions which generalize across languages. We contribute the codes, datasets and model checkpoints and host the demos on the HuggingFace platform.

  • 2.
    Adewumi, Oluwatosin
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Conversational Systems in Machine Learning from the Point of View of the Philosophy of Science—Using Alime Chat and Related Studies. 2019. In: Philosophies, ISSN 2409-9287, Vol. 4, no 3, article id 41. Article in journal (Refereed)
    Abstract [en]

    This essay discusses current research efforts in conversational systems from the philosophy of science point of view and evaluates some conversational systems research activities from the standpoint of naturalism philosophical theory. Conversational systems or chatbots have advanced over the decades and now have become mainstream applications. They are software that users can communicate with, using natural language. Particular attention is given to the Alime Chat conversational system, already in industrial use, and the related research. The competitive nature of systems in production is a result of different researchers and developers trying to produce new conversational systems that can outperform previous or state-of-the-art systems. Different factors affect the quality of the conversational systems produced, and how one system is assessed as being better than another is a function of objectivity and of the relevant experimental results. This essay examines the research practices from, among others, Longino’s view on objectivity and Popper’s stand on falsification. Furthermore, the need for qualitative and large datasets is emphasized. This is in addition to the importance of the peer-review process in scientific publishing, as a means of developing, validating, or rejecting theories, claims, or methodologies in the research community. In conclusion, open data and open scientific discussion fora should become more prominent over the mere publication-focused trend.

  • 3.
    Adewumi, Oluwatosin
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Corpora Compared: The Case of the Swedish Gigaword & Wikipedia Corpora. 2020. Conference paper (Refereed)
    Abstract [en]

    In this work, we show that the difference in performance of embeddings from differently sourced data for a given language can be due to other factors besides data size. Natural language processing (NLP) tasks usually perform better with embeddings from bigger corpora. However, broadness of covered domain and noise can play important roles. We evaluate embeddings based on two Swedish corpora: The Gigaword and Wikipedia, in analogy (intrinsic) tests and discover that the embeddings from the Wikipedia corpus generally outperform those from the Gigaword corpus, which is a bigger corpus. Downstream tests will be required to have a definite evaluation.

  • 4.
    Adewumi, Oluwatosin
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Exploring Swedish & English fastText Embeddings. 2022. In: Artificial Intelligence and Cognition 2022: Proceedings of the 8th International Workshop on Artificial Intelligence and Cognition / [ed] Hadi Banaee, Amy Loutfi, Alessandro Saffiotti, Antonio Lieto, 2022, Vol. 3400, p. 201-208. Conference paper (Refereed)
    Abstract [en]

    In this paper, we show that embeddings from relatively smaller corpora sometimes outperform those from larger corpora and we introduce a new Swedish analogy test set and make it publicly available. To achieve good performance in Natural Language Processing (NLP) downstream tasks, several factors play important roles: dataset size, the right hyper-parameters, and well-trained embeddings. We utilize the fastText tool for our experiments. We evaluate both the Swedish and English embeddings that we created using intrinsic evaluation (including analogy & Spearman correlation) and compare them with 2 common, publicly available embeddings. Our English continuous Bag-of-Words (CBoW)-negative sampling embedding shows better performance compared to the publicly available GoogleNews version. We also describe the relationship between NLP and cognitive science. We contribute the embeddings for research or other useful purposes by publicly releasing them.

  • 5.
    Adewumi, Oluwatosin
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Exploring Swedish & English fastText Embeddings for NER with the Transformer. Manuscript (preprint) (Other academic)
    Abstract [en]

    In this paper, our main contributions are showing that embeddings from relatively smaller corpora can outperform ones from far larger corpora and presenting the new Swedish analogy test set. To achieve good network performance in natural language processing (NLP) downstream tasks, several factors play important roles: dataset size, the right hyper-parameters, and well-trained embeddings. We show that, with the right set of hyper-parameters, good network performance can be reached even on smaller datasets. We evaluate the embeddings at the intrinsic level and extrinsic level, by deploying them on the Transformer in a named entity recognition (NER) task, and conduct significance tests. This is done for both Swedish and English. We obtain better performance in both languages on the downstream task with far smaller training data, compared to recently released, common crawl versions; and character n-grams appear useful for Swedish, a morphologically rich language.

  • 6.
    Adewumi, Oluwatosin
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Vector Representations of Idioms in Conversational Systems. 2022. In: Sci, E-ISSN 2413-4155, Vol. 4, no 4, article id 37. Article in journal (Refereed)
    Abstract [en]

    In this study, we demonstrate that an open-domain conversational system trained on idioms or figurative language generates more fitting responses to prompts containing idioms. Idioms are a part of everyday speech in many languages and across many cultures, but they pose a great challenge for many natural language processing (NLP) systems that involve tasks such as information retrieval (IR), machine translation (MT), and conversational artificial intelligence (AI). We utilized the Potential Idiomatic Expression (PIE)-English idiom corpus for the two tasks that we investigated: classification and conversation generation. We achieved a state-of-the-art (SoTA) result of a 98% macro F1 score on the classification task by using the SoTA T5 model. We experimented with three instances of the SoTA dialogue model—the Dialogue Generative Pre-trained Transformer (DialoGPT)—for conversation generation. Their performances were evaluated by using the automatic metric, perplexity, and a human evaluation. The results showed that the model trained on the idiom corpus generated more fitting responses to prompts containing idioms 71.9% of the time in comparison with a similar model that was not trained on the idiom corpus. We have contributed the model checkpoint/demo/code to the HuggingFace hub for public access.

  • 7.
    Adewumi, Oluwatosin
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Word2Vec: Optimal hyperparameters and their impact on natural language processing downstream tasks. 2022. In: Open Computer Science, E-ISSN 2299-1093, Vol. 12, no 1, p. 134-141. Article in journal (Refereed)
    Abstract [en]

    Word2Vec is a prominent model for natural language processing tasks. Similar inspiration is found in distributed embeddings (word-vectors) in recent state-of-the-art deep neural networks. However, a wrong combination of hyperparameters can produce embeddings with poor quality. The objective of this work is to empirically show that an optimal combination of Word2Vec hyperparameters exists and to evaluate various combinations. We compare them with the publicly released, original Word2Vec embedding. Both intrinsic and extrinsic (downstream) evaluations are carried out, including named entity recognition and sentiment analysis. Our main contributions include showing that the best model is usually task-specific, high analogy scores do not necessarily correlate positively with F1 scores, and performance is not dependent on data size alone. If ethical considerations to save time, energy, and the environment are made, then relatively smaller corpora may do just as well or even better in some cases. Increasing the dimension size of embeddings after a point leads to poor quality or performance. In addition, using a relatively small corpus, we obtain better WordSim scores, corresponding Spearman correlation, and better downstream performances (with significance tests) compared to the original model, which is trained on a 100 billion-word corpus.

  • 8.
    Adewumi, Oluwatosin
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Word2Vec: Optimal Hyper-Parameters and Their Impact on NLP Downstream Tasks. Manuscript (preprint) (Other academic)
    Abstract [en]

    Word2Vec is a prominent model for natural language processing (NLP) tasks. Similar inspiration is found in distributed embeddings for new state-of-the-art (SotA) deep neural networks. However, a wrong combination of hyper-parameters can produce poor quality vectors. The objective of this work is to empirically show that an optimal combination of hyper-parameters exists and to evaluate various combinations. We compare them with the released, pre-trained original word2vec model. Both intrinsic and extrinsic (downstream) evaluations, including named entity recognition (NER) and sentiment analysis (SA), were carried out. The downstream tasks reveal that the best model is usually task-specific, high analogy scores don’t necessarily correlate positively with F1 scores and the same applies to focus on data alone. Increasing vector dimension size after a point leads to poor quality or performance. If ethical considerations to save time, energy and the environment are made, then reasonably smaller corpora may do just as well or even better in some cases. Besides, using a small corpus, we obtain better human-assigned WordSim scores, corresponding Spearman correlation and better downstream performances (with significance tests) compared to the original model, trained on a 100 billion-word corpus.

  • 9.
    Adewumi, Oluwatosin
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Sabry, Sana Sabah
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Abid, Nosheen
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    T5 for Hate Speech, Augmented Data, and Ensemble. 2023. In: Sci, E-ISSN 2413-4155, Vol. 5, no 4, article id 37. Article in journal (Refereed)
    Abstract [en]

    We conduct relatively extensive investigations of automatic hate speech (HS) detection using different State-of-The-Art (SoTA) baselines across 11 subtasks spanning six different datasets. Our motivation is to determine which of the recent SoTA models is best for automatic hate speech detection and what advantage methods, such as data augmentation and ensemble, may have on the best model, if any. We carry out six cross-task investigations. We achieve new SoTA results on two subtasks—macro F1 scores of 91.73% and 53.21% for subtasks A and B of the HASOC 2020 dataset, surpassing previous SoTA scores of 51.52% and 26.52%, respectively. We achieve near-SoTA results on two others—macro F1 scores of 81.66% for subtask A of the OLID 2019 and 82.54% for subtask A of the HASOC 2021, in comparison to SoTA results of 82.9% and 83.05%, respectively. We perform error analysis and use two eXplainable Artificial Intelligence (XAI) algorithms (Integrated Gradient (IG) and SHapley Additive exPlanations (SHAP)) to reveal how two of the models (Bi-Directional Long Short-Term Memory Network (Bi-LSTM) and Text-to-Text-Transfer Transformer (T5)) make the predictions they do by using examples. Other contributions of this work are: (1) the introduction of a simple, novel mechanism for correcting Out-of-Class (OoC) predictions in T5, (2) a detailed description of the data augmentation methods, and (3) the revelation of the poor data annotations in the HASOC 2021 dataset by using several examples and XAI (buttressing the need for better quality control). We publicly release our model checkpoints and codes to foster transparency.

  • 10.
    Adewumi, Tosin
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab. Masakhane.
    Adeyemi, Mofetoluwa
    Masakhane.
    Anuoluwapo, Aremu
    Masakhane.
    Peters, Bukola
    CIS.
    Buzaaba, Happy
    Masakhane.
    Samuel, Oyerinde
    Masakhane.
    Rufai, Amina Mardiyyah
    Masakhane.
    Ajibade, Benjamin
    Masakhane.
    Gwadabe, Tajudeen
    Masakhane.
    Koulibaly Traore, Mory Moussou
    Masakhane.
    Ajayi, Tunde Oluwaseyi
    Masakhane.
    Muhammad, Shamsuddeen
    Baruwa, Ahmed
    Masakhane.
    Owoicho, Paul
    Masakhane.
    Ogunremi, Tolulope
    Masakhane.
    Ngigi, Phylis
    Jomo Kenyatta University of Agriculture and Technology.
    Ahia, Orevaoghene
    Masakhane.
    Nasir, Ruqayya
    Masakhane.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    AfriWOZ: Corpus for Exploiting Cross-Lingual Transfer for Dialogue Generation in Low-Resource, African Languages. 2023. In: IJCNN 2023 - International Joint Conference on Neural Networks, Conference Proceedings, Institute of Electrical and Electronics Engineers Inc., 2023. Conference paper (Refereed)
    Abstract [en]

    Dialogue generation is an important NLP task fraught with many challenges. The challenges become more daunting for low-resource African languages. To enable the creation of dialogue agents for African languages, we contribute the first high-quality dialogue datasets for 6 African languages: Swahili, Wolof, Hausa, Nigerian Pidgin English, Kinyarwanda & Yorùbá. There are a total of 9,000 turns, each language having 1,500 turns, which we translate from a portion of the English multi-domain MultiWOZ dataset. Subsequently, we benchmark by investigating & analyzing the effectiveness of modelling through transfer learning by utilizing state-of-the-art (SoTA) deep monolingual models: DialoGPT and BlenderBot. We compare the models with a simple seq2seq baseline using perplexity. Besides this, we conduct human evaluation of single-turn conversations by using majority votes and measure inter-annotator agreement (IAA). We find that the hypothesis that deep monolingual models learn some abstractions that generalize across languages holds. We observe human-like conversations, to different degrees, in 5 out of the 6 languages. The language with the most transferable properties is the Nigerian Pidgin English, with a human-likeness score of 78.1%, of which 34.4% are unanimous. We freely provide the datasets and host the model checkpoints/demos on the HuggingFace hub for public access.

  • 11.
    Adewumi, Tosin
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Alkhaled, Lama
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Mokayed, Hamam
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    ML_LTU at SemEval-2022 Task 4: T5 Towards Identifying Patronizing and Condescending Language. 2022. In: Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022) / [ed] Guy Emerson, Natalie Schluter, Gabriel Stanovsky, Ritesh Kumar, Alexis Palmer, Nathan Schneider, Siddharth Singh, Shyam Ratan, Association for Computational Linguistics, 2022, p. 473-478. Conference paper (Refereed)
    Abstract [en]

    This paper describes the system used by the Machine Learning Group of LTU in subtask 1 of the SemEval-2022 Task 4: Patronizing and Condescending Language (PCL) Detection. Our system consists of finetuning a pretrained text-to-text transfer transformer (T5) and innovatively reducing its out-of-class predictions. The main contributions of this paper are 1) the description of the implementation details of the T5 model we used, 2) analysis of the successes & struggles of the model in this task, and 3) ablation studies beyond the official submission to ascertain the relative importance of data split. Our model achieves an F1 score of 0.5452 on the official test set.

  • 12.
    Adewumi, Tosin
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    State-of-the-Art in Open-Domain Conversational AI: A Survey. 2022. In: Information, E-ISSN 2078-2489, Vol. 13, no 6, article id 298. Article, review/survey (Refereed)
    Abstract [en]

    We survey SoTA open-domain conversational AI models with the objective of presenting the prevailing challenges that still exist to spur future research. In addition, we provide statistics on the gender of conversational AI in order to guide the ethics discussion surrounding the issue. Open-domain conversational AI models are known to have several challenges, including bland, repetitive responses and performance degradation when prompted with figurative language, among others. First, we provide some background by discussing some topics of interest in conversational AI. We then discuss the method applied to the two investigations carried out that make up this study. The first investigation involves a search for recent SoTA open-domain conversational AI models, while the second involves the search for 100 conversational AI to assess their gender. Results of the survey show that progress has been made with recent SoTA conversational AI, but there are still persistent challenges that need to be solved, and the female gender is more common than the male for conversational AI. One main takeaway is that hybrid models of conversational AI offer more advantages than any single architecture. The key contributions of this survey are (1) the identification of prevailing challenges in SoTA open-domain conversational AI, (2) the rarely held discussion on open-domain conversational AI for low-resource languages, and (3) the discussion about the ethics surrounding the gender of conversational AI.

  • 13.
    Adewumi, Tosin P.
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    The Challenge of Diacritics in Yorùbá Embeddings. 2020. In: ML4D 2020 Proceedings / [ed] Tejumade Afonja; Konstantin Klemmer; Aya Salama; Paula Rodriguez Diaz; Niveditha Kalavakonda; Oluwafemi Azeez, Neural Information Processing Systems Foundation, 2020, article id 2011.07605. Conference paper (Refereed)
    Abstract [en]

    The major contributions of this work include the empirical establishment of a better performance for Yoruba embeddings from an undiacritized (normalized) dataset and provision of new analogy sets for evaluation. The Yoruba language, being a tonal language, utilizes diacritics (tonal marks) in written form. We show that this affects embedding performance by creating embeddings from exactly the same Wikipedia dataset but with the second one normalized to be undiacritized. We further compare average intrinsic performance with two other works (using analogy test set & WordSim) and we obtain the best performance in WordSim and corresponding Spearman correlation.

  • 14.
    Adewumi, Tosin P.
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Vector Representations of Idioms in Chatbots. 2020. In: Proceedings: SAIS Workshop 2020, Chalmers University of Technology, 2020. Conference paper (Refereed)
    Abstract [en]

    Open-domain chatbots have advanced but still have many gaps. My PhD aims to solve a few of those gaps by creating vector representations of idioms (figures of speech) that will be beneficial to chatbots and natural language processing (NLP), generally. In the process, new, optimal fastText embeddings in Swedish and English have been created and the first Swedish analogy test set, larger than the Google original, for intrinsic evaluation of Swedish embeddings has also been produced. Major milestones have been attained and others are soon to follow. The deliverables of this project will give NLP researchers the opportunity to measure the quality of Swedish embeddings easily and advance state-of-the-art (SotA) in NLP.

  • 15.
    Adewumi, Tosin
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Södergren, Isabella
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Digital Services and Systems.
    Alkhaled, Lama
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Sabry, Sana Sabah
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Bipol: Multi-axes Evaluation of Bias with Explainability in Benchmark Datasets. 2023. In: Proceedings of Recent Advances in Natural Language Processing / [ed] Galia Angelova, Maria Kunilovskaya and Ruslan Mitkov, Incoma Ltd., 2023, p. 1-10. Conference paper (Refereed)
    Abstract [en]

    We investigate five English NLP benchmark datasets (on the superGLUE leaderboard) and two Swedish datasets for bias, along multiple axes. The datasets are the following: Boolean Question (Boolq), CommitmentBank (CB), Winograd Schema Challenge (WSC), Winogender diagnostic (AXg), Recognising Textual Entailment (RTE), Swedish CB, and SWEDN. Bias can be harmful and it is known to be common in data, which ML models learn from. In order to mitigate bias in data, it is crucial to be able to estimate it objectively. We use bipol, a novel multi-axes bias metric with explainability, to estimate and explain how much bias exists in these datasets. Multilingual, multi-axes bias evaluation is not very common. Hence, we also contribute a new, large Swedish bias-labeled dataset (of 2 million samples), translated from the English version and train the SotA mT5 model on it. In addition, we contribute new multi-axes lexica for bias detection in Swedish. We make the codes, model, and new dataset publicly available.

  • 16.
    Adewumi, Tosin
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Vadoodi, Roshanak
    Luleå University of Technology, Department of Civil, Environmental and Natural Resources Engineering, Geosciences and Environmental Engineering.
    Tripathy, Aparajita
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Nikolaidou, Konstantina
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Potential Idiomatic Expression (PIE)-English: Corpus for Classes of Idioms. 2022. In: Proceedings of the 13th Language Resources and Evaluation Conference / [ed] Nicoletta Calzolari; Frédéric Béchet; Philippe Blache; Khalid Choukri; Christopher Cieri; Thierry Declerck; Sara Goggi; Hitoshi Isahara; Bente Maegaard; Joseph Mariani; Hélène Mazo; Jan Odijk; Stelios Piperidis, European Language Resources Association (ELRA), 2022, p. 689-696. Conference paper (Refereed)
    Abstract [en]

    We present a fairly large Potential Idiomatic Expression (PIE) dataset for Natural Language Processing (NLP) in English. The challenges NLP systems face in tasks such as Machine Translation (MT), word sense disambiguation (WSD) and information retrieval make it imperative to have a labelled idioms dataset with classes, such as the one in this work. To the best of the authors’ knowledge, this is the first idioms corpus with classes of idioms beyond the literal and the general idioms classification. In particular, the following classes are labelled in the dataset: metaphor, simile, euphemism, parallelism, personification, oxymoron, paradox, hyperbole, irony and literal. We obtain an overall inter-annotator agreement (IAA) score, between two independent annotators, of 88.89%. Many past efforts have been limited in corpus size and classes of samples, but this dataset contains over 20,100 samples with almost 1,200 cases of idioms (with their meanings) from 10 classes (or senses). The corpus may also be extended by researchers to meet specific needs. The corpus has part of speech (PoS) tagging from the NLTK library. Classification experiments performed on the corpus to obtain a baseline and comparison among three common models, including the state-of-the-art (SoTA) BERT model, give good results. We also make the corpus and the relevant code for working with it for NLP tasks publicly available.

  • 17.
    Al-Azzawi, Sana Sabah Sabry
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Kovács, György
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Mokayed, Hamam
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Chronéer, Diana
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Digital Services and Systems.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Innovative Education Approach Toward Active Distance Education: a Case Study in the Introduction to AI course. 2022. In: Conference Proceedings. The Future of Education 2022, 2022. Conference paper (Refereed)
    Abstract [en]

    In this paper, we first describe various synchronous and asynchronous methods for enhancing student engagement in large online courses. We showcase the implementation of these methods in the “Introduction to Artificial Intelligence (AI)” course at Luleå University of Technology, which has attracted around 500 students in each of its iterations (twice yearly, since 2019). We also show that these methods can be applied efficiently, in terms of the teaching hours required. With the increase in digitization and student mobility, the demand for improved and personalized content delivery for distance education has also increased. This applies not only in the context of traditional undergraduate education, but also in the context of adult education and lifelong learning. This higher level of demand, however, introduces a challenge, especially as it is typically combined with a shortage of staff and a need for efficient education. This challenge is further amplified by the current pandemic situation, which led to an even bigger risk of student dropout. To mitigate this risk, as well as to meet the increased demand, we applied various methods for creating engaging interaction in our pedagogy based on Moore’s framework: learner-to-learner, learner-to-instructor, and learner-to-content engagement strategies. The main methods of this pedagogy are as follows: short, interactive videos, active discussions in topic-based forums, regular live sessions with group discussions, and the introduction of optional content at many points in the course, to address different target groups. In this paper, we show how we originally designed and continuously improved the course, without requiring more than 500 teaching hours per iteration (one hour per enrolled student), while we also managed to increase the successful completion rate of the participants by 10%, and improved student engagement and feedback for the course by 50%. We intend to share a set of best practices applicable to many other e-learning courses in ICT.

  • 18.
    Günther, Christian
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Jansson, Nils
    Luleå University of Technology, Department of Civil, Environmental and Natural Resources Engineering, Geosciences and Environmental Engineering.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Towards a Machine Learning Framework for Drill Core Analysis. 2021. In: 2021 Swedish Artificial Intelligence Society Workshop (SAIS), IEEE, 2021, p. 19-24. Conference paper (Refereed)
    Abstract [en]

    This paper discusses existing methods for geological analysis of drill cores and describes the research and development directions of a machine learning framework for such a task. Drill core analysis is one of the first steps of the mining value chain. Such analysis incorporates a high complexity of input features (visual and compositional) derived from multiple sources and commonly by multiple observers. In particular, the huge amount of visual information available from the drill core can provide valuable insights, but due to the complexity of many geological materials, automated data acquisition is difficult. This paper (i) describes the difficulty of drill core analysis, (ii) discusses common approaches and recent machine learning-based approaches to address the issues towards automation, and finally, (iii) proposes a machine learning-based framework for drill core analysis which is currently in development. The first major component, the registration of the drill core image for further processing, is presented in detail and evaluated on a dataset of 180 drill core images. We furthermore investigate the amount of labelled data required to automate the drill core analysis. Notably, even a few labelled images led to an average precision (AP) of around 80%, which indicates that manual drill core analysis can be made more efficient with the support of a machine learning/labeling workflow.

  • 19.
    Javed, Saleha
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Adewumi, Oluwatosin
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Understanding the Role of Objectivity in Machine Learning and Research Evaluation. 2021. In: Philosophies, ISSN 2409-9287, Vol. 6, no 1, article id 22. Article in journal (Refereed)
    Abstract [en]

    This article makes the case for more objectivity in Machine Learning (ML) research. Any research work that claims to hold benefits has to be scrutinized based on many parameters, such as the methodology employed, ethical considerations and its theoretical or technical contribution. We approach this discussion from a Naturalist philosophical outlook. Although every analysis may be subjective, it is important for the research community to keep vetting the research for continuous growth and to produce even better work. We suggest standardizing some of the steps in ML research in an objective way and being aware of various biases threatening objectivity. The ideal of objectivity keeps research rational since objectivity requires beliefs to be based on facts. We discuss some of the current challenges, the role of objectivity in the two elements (product and process) that are up for consideration in ML and make recommendations to support the research community.

  • 20.
    Liwicki, Foteini
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Gupta, Vibha
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    De, Kanjar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Rethinking the Methods and Algorithms for Inner Speech Decoding and Making Them Reproducible. 2022. In: NeuroSci, ISSN 2673-4087, Vol. 3, no 2, p. 226-244. Article in journal (Refereed)
    Abstract [en]

    This study focuses on the automatic decoding of inner speech using noninvasive methods, such as Electroencephalography (EEG). While inner speech has been a research topic in philosophy and psychology for half a century, recent attempts have been made to decode nonvoiced spoken words by using various brain–computer interfaces. The main shortcomings of existing work are reproducibility and the availability of data and code. In this work, we investigate various methods (using Convolutional Neural Network (CNN), Gated Recurrent Unit (GRU), Long Short-Term Memory Networks (LSTM)) for the detection task of five vowels and six words on a publicly available EEG dataset. The main contributions of this work are (1) a comparison of subject-dependent and subject-independent approaches, (2) an analysis of the effect of different preprocessing steps (Independent Component Analysis (ICA), down-sampling and filtering), and (3) word classification (where we achieve state-of-the-art performance on a publicly available dataset). Overall, we achieve an accuracy of 35.20% and 29.21% when classifying five vowels and six words, respectively, in a publicly available dataset, using our tuned iSpeech-CNN architecture. All of our code and processed data are publicly available to ensure reproducibility. As such, this work contributes to a deeper understanding and reproducibility of experiments in the area of inner speech detection.

  • 21.
    Liwicki, Foteini Simistira
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Deep learning for historical document analysis. 2020. In: Handbook Of Pattern Recognition And Computer Vision / [ed] C. H. Chen, World Scientific, 2020, 6, p. 287-303. Chapter in book (Other academic)
    Abstract [en]

    This chapter gives an overview of the state of the art and recent methods in the area of historical document analysis. Historical documents differ from ordinary documents due to the presence of different artifacts. Issues such as poor condition of the documents, texture, noise and degradation, large variability of page layout, page skew, random alignment, variety of fonts, presence of embellishments, variations in spacing between characters, words, lines, paragraphs and margins, overlapping object boundaries, superimposition of information layers, etc. make them complex to analyze. Most current methods are based on deep learning, including Convolutional Neural Networks and Long Short-Term Memory Networks. In addition to the overview of the state of the art, this chapter describes a recently introduced idea for the detection of graphical elements in historical documents and an ongoing effort towards the creation of a large database.

  • 22.
    Liwicki, Foteini
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Upadhyay, Richa
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Chhipa, Prakash Chandra
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Murphy, Killian
    SAMOVAR laboratory, Telecom SudParis, Institut Polytechnique de Paris, France.
    Visi, Federico
    Luleå University of Technology, Department of Social Sciences, Technology and Arts, Music, Media and Theater.
    Östersjö, Stefan
    Luleå University of Technology, Department of Social Sciences, Technology and Arts, Music, Media and Theater.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Deep Neural Network approaches for Analysing Videos of Music Performances. Manuscript (preprint) (Other academic)
  • 23.
    Mishra, Ashish Ranjan
    et al.
    Department of Computer Science and Engineering, Madan Mohan Malaviya University of Technology, Gorakhpur, UP, India.
    Kumar, Rakesh
    Department of Computer Science and Engineering, Madan Mohan Malaviya University of Technology, Gorakhpur, UP, India.
    Gupta, Vibha
    Department of Molecular and Clinical Medicine, University of Gothenburg, Gothenburg, Sweden.
    Prabhu, Sameer
    Luleå University of Technology, Department of Civil, Environmental and Natural Resources Engineering, Operation, Maintenance and Acoustics.
    Upadhyay, Richa
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Chhipa, Prakash Chandra
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Rakesh, Sumit
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Mokayed, Hamam
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Das Chakladar, Debashis
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    De, Kanjar
    Department of Video Communication and Applications, Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, Berlin, Germany.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Simistira Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    SignEEG v1.0: Multimodal Dataset with Electroencephalography and Hand-written Signature for Biometric Systems. 2024. In: Scientific Data, E-ISSN 2052-4463, Vol. 11, article id 718. Article in journal (Refereed)
    Abstract [en]

    Handwritten signatures in biometric authentication leverage unique individual characteristics for identification, offering high specificity through dynamic and static properties. However, this modality faces significant challenges from sophisticated forgery attempts, underscoring the need for enhanced security measures in common applications. To address forgery in signature-based biometric systems, integrating a forgery-resistant modality, namely, noninvasive electroencephalography (EEG), which captures unique brain activity patterns, can significantly enhance system robustness by leveraging multimodality’s strengths. By combining EEG, a physiological modality, with handwritten signatures, a behavioral modality, our approach capitalizes on the strengths of both, significantly fortifying the robustness of biometric systems through this multimodal integration. In addition, EEG’s resistance to replication offers a high security level, making it a robust addition to user identification and verification. This study presents a new multimodal SignEEG v1.0 dataset based on EEG and hand-drawn signatures from 70 subjects. EEG signals and hand-drawn signatures have been collected with Emotiv Insight and Wacom One sensors, respectively. The multimodal data consists of three paradigms based on mental imagery, motor imagery, and physical execution: (i) thinking of the signature’s image, (ii) drawing the signature mentally, and (iii) drawing a signature physically. Extensive experiments have been conducted to establish a baseline with machine learning classifiers. The results demonstrate that multimodality in biometric systems significantly enhances robustness, achieving high reliability even with limited sample sizes. We release the raw and pre-processed data, along with easy-to-follow implementation details.

  • 24.
    Mishra, Ashish Ranjan
    et al.
    Rajkiya Engineering College Sonbhadra, UP, India; Madan Mohan Malaviya University of Technology, Gorakhpur, UP, India.
    Kumar, Rakesh
    Madan Mohan Malaviya University of Technology, Gorakhpur, UP, India.
    Gupta, Vibha
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Prabhu, Sameer
    Luleå University of Technology, Department of Civil, Environmental and Natural Resources Engineering, Operation, Maintenance and Acoustics.
    Upadhyay, Richa
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Chhipa, Prakash Chandra
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Rakesh, Sumit
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Mokayed, Hamam
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    SignEEG v1.0: Multimodal Electroencephalography and Signature Database for Biometric Systems. 2023. Manuscript (preprint) (Other academic)
  • 25.
    Nilsson, Mattias
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Juny Pina, Ton
    Bio-Inspired Circuits and Systems (BICS) Lab, Zernike Institute for Advanced Materials, University of Groningen, Groningen, The Netherlands; Groningen Cognitive Systems and Materials Center (CogniGron), University of Groningen, Groningen, The Netherlands.
    Khacef, Lyes
    Bio-Inspired Circuits and Systems (BICS) Lab, Zernike Institute for Advanced Materials, University of Groningen, Groningen, The Netherlands; Groningen Cognitive Systems and Materials Center (CogniGron), University of Groningen, Groningen, The Netherlands.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Chicca, Elisabetta
    Bio-Inspired Circuits and Systems (BICS) Lab, Zernike Institute for Advanced Materials, University of Groningen, Groningen, The Netherlands; Groningen Cognitive Systems and Materials Center (CogniGron), University of Groningen, Groningen, The Netherlands.
    Sandin, Fredrik
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    A Comparison of Temporal Encoders for Neuromorphic Keyword Spotting with Few Neurons. 2023. In: 2023 International Joint Conference on Neural Networks (IJCNN): Conference Proceedings, IEEE, 2023. Conference paper (Refereed)
    Abstract [en]

    With the expansion of AI-powered virtual assistants, there is a need for low-power keyword spotting systems providing a “wake-up” mechanism for subsequent computationally expensive speech recognition. One promising approach is the use of neuromorphic sensors and spiking neural networks (SNNs) implemented in neuromorphic processors for sparse event-driven sensing. However, this requires resource-efficient SNN mechanisms for temporal encoding, which need to consider that these systems process information in a streaming manner, with physical time being an intrinsic property of their operation. In this work, two candidate neurocomputational elements for temporal encoding and feature extraction in SNNs described in recent literature—the spiking time-difference encoder (TDE) and disynaptic excitatory-inhibitory (E-I) elements—are comparatively investigated in a keyword-spotting task on formants computed from spoken digits in the TIDIGITS dataset. While both encoders improve performance over direct classification of the formant features in the training data, enabling a complete binary classification with a logistic regression model, they show no clear improvements on the test set. Resource-efficient keyword spotting applications may benefit from the use of these encoders, but further work on methods for learning the time constants and weights is required to investigate their full potential.

  • 26.
    Nilsson, Mattias
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Sandin, Fredrik
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Spatiotemporal Pattern Recognition in Single Mixed-Signal VLSI Neurons with Heterogeneous Dynamic Synapses. 2022. In: Proceedings of International Conference on Neuromorphic Systems 2022, Association for Computing Machinery (ACM), 2022, article id 4. Conference paper (Refereed)
    Abstract [en]

    Mixed-signal neuromorphic processors with brain-like organization and device physics offer an ultra-low-power alternative to the unsustainable developments of conventional deep learning and computing. However, realizing the potential of such neuromorphic hardware requires efficient use of its heterogeneous, analog neurosynaptic circuitry with neurocomputational methods for sparse, spike-timing-based encoding and processing. Here, we investigate the use of balanced excitatory–inhibitory disynaptic lateral connections as a resource-efficient mechanism for implementing a thalamocortically inspired Spatiotemporal Correlator (STC) neural network without using dedicated delay mechanisms. We present hardware-in-the-loop experiments with a DYNAP-SE neuromorphic processor, in which receptive fields of heterogeneous coincidence-detection neurons in an STC network with four lateral afferent connections per column were mapped by random input-sampling. Furthermore, we demonstrate how such a neuron was tuned to detect a particular spatiotemporal feature by discrete address-reprogramming of the analog synaptic circuits. The energy dissipation of the disynaptic connections is one order of magnitude lower per lateral connection (0.65 nJ vs 9.6 nJ per spike) than in the former delay-based hardware implementation of the STC.

  • 27.
    Nilsson, Mattias
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Sandin, Fredrik
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Synaptic Integration of Spatiotemporal Features with a Dynamic Neuromorphic Processor. 2020. In: 2020 International Joint Conference on Neural Networks (IJCNN): Conference Proceedings, IEEE, 2020, article id 21471. Conference paper (Refereed)
    Abstract [en]

    Spiking neurons can perform spatiotemporal feature detection by nonlinear synaptic and dendritic integration of presynaptic spike patterns. Multicompartment models of nonlinear dendrites and related neuromorphic circuit designs enable faithful imitation of such dynamic integration processes, but these approaches are also associated with a relatively high computing cost or circuit size. Here, we investigate synaptic integration of spatiotemporal spike patterns with multiple dynamic synapses on point-neurons in the DYNAP-SE neuromorphic processor, which offers a complementary resource-efficient, albeit less flexible, approach to feature detection. We investigate how previously proposed excitatory–inhibitory pairs of dynamic synapses can be combined to integrate multiple inputs, and we generalize that concept to a case in which one inhibitory synapse is combined with multiple excitatory synapses. We characterize the resulting delayed excitatory postsynaptic potentials (EPSPs) by measuring and analyzing the membrane potentials of the neuromorphic neuronal circuits. We find that biologically relevant EPSP delays, with variability of order 10 milliseconds per neuron, can be realized in the proposed manner by selecting different synapse combinations, thanks to device mismatch. Based on these results, we demonstrate that a single point-neuron with dynamic synapses in the DYNAP-SE can respond selectively to presynaptic spikes with a particular spatiotemporal structure, which enables, for instance, visual feature tuning of single neurons.

  • 28.
    Rakesh, Sumit
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Mokayed, Hamam
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Upadhyay, Richa
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Chhipa, Prakash Chandra
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Gupta, Vibha
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    De, Kanjar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Kovács, György
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Singh, Dinesh
    Computer Science & Engineering, DCRUST, Murthal, Sonepat, India.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab. Department of CSE, IIT Roorkee, Roorkee, India.
    Emotions Classification Using EEG in Health Care. 2023. In: Computer Vision and Machine Intelligence: Proceedings of CVMI 2022 / [ed] Tistarelli, Massimo; Dubey, Shiv Ram; Singh, Satish Kumar; Jiang, Xiaoyi, Springer Nature, 2023, p. 37-49. Conference paper (Refereed)
  • 29.
    Sabry, Sana Sabah
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Adewumi, Tosin
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Abid, Nosheen
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Kovács, György
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    HaT5: Hate Language Identification using Text-to-Text Transfer Transformer. 2022. In: 2022 International Joint Conference on Neural Networks (IJCNN): Conference Proceedings, Institute of Electrical and Electronics Engineers (IEEE), 2022. Conference paper (Refereed)
    Abstract [en]

    We investigate the performance of a state-of-the-art (SoTA) architecture T5 (available on the SuperGLUE) and compare it with 3 other previous SoTA architectures across 5 different tasks from 2 relatively diverse datasets. The datasets are diverse in terms of the number and types of tasks they have. To improve performance, we augment the training data by using a new autoregressive conversational AI model checkpoint. We achieve near-SoTA results on a couple of the tasks: macro F1 scores of 81.66% for task A of the OLID 2019 dataset and 82.54% for task A of the hate speech and offensive content (HASOC) 2021 dataset, where the SoTA scores are 82.9% and 83.05%, respectively. We perform error analysis and explain why one of the models (Bi-LSTM) makes the predictions it does by using a publicly available algorithm: Integrated Gradient (IG). This is because explainable artificial intelligence (XAI) is essential for earning the trust of users. The main contributions of this work are the implementation method of T5, which is discussed; the data augmentation, which brought performance improvements; and the revelation of shortcomings in the HASOC 2021 dataset. The revelation shows the difficulties of poor data annotation by using a small set of examples where the T5 model made the correct predictions, even when the ground truth of the test set was incorrect (in our opinion). We also provide our model checkpoints on the HuggingFace hub (https://huggingface.co/sana-ngu/HaT5_augmentation and https://huggingface.co/sana-ngu/HaT5).

  • 30.
    Saini, Rajkumar
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Dobson, Derek
    FamilySearch, USA.
    Morrey, Jon
    FamilySearch, USA.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    ICDAR 2019 Historical Document Reading Challenge on Large Structured Chinese Family Records. 2019. In: The 15th IAPR International Conference on Document Analysis and Recognition: ICDAR 2019, Piscataway, New Jersey, USA: IEEE, 2019, p. 1499-1504. Conference paper (Refereed)
    Abstract [en]

    In this paper, we present a large historical database of Chinese family records with the aim to develop robust systems for historical document analysis. In this direction, we propose a Historical Document Reading Challenge on Large Chinese Structured Family Records (ICDAR 2019 HDRCCHINESE). The objective of the competition is to recognize and analyze the layout, and finally detect and recognize the textlines and characters of the large historical document image dataset containing more than 10000 pages. Cascade R-CNN, CRNN, and U-Net based architectures were trained to evaluate the performances in these tasks. Error rate of 0.01 has been recorded for textline recognition (Task1) whereas a Jaccard Index of 99.54% has been recorded for layout analysis (Task2). The graph edit distance based total error ratio of 1.5% has been recorded for complete integrated textline detection and recognition (Task3).
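
    The Jaccard Index reported for layout analysis is the ratio of intersection to union between a predicted and a reference region. The abstract does not spell out the competition's exact protocol, so the following is only a minimal sketch of the measure itself; representing regions as sets of pixel coordinates is an assumption made for illustration.

```python
def jaccard_index(predicted, reference):
    """Intersection over union of two collections of items (e.g. pixel coordinates)."""
    predicted, reference = set(predicted), set(reference)
    union = predicted | reference
    if not union:
        # Two empty regions are treated as identical (a convention chosen here)
        return 1.0
    return len(predicted & reference) / len(union)


# Hypothetical 2x2 toy regions, not competition data
pred_pixels = {(0, 0), (0, 1), (1, 0)}
ref_pixels = {(0, 1), (1, 0), (1, 1)}
iou = jaccard_index(pred_pixels, ref_pixels)
```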

  • 31.
    Saini, Rajkumar
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Prabhu, Sameer
    Data Ductus AB, Luleå, Sweden.
    Upadhyay, Richa
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Rakesh, Sumit
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Chhipa, Prakash Chandra
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Mokayed, Hamam
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Imagined Object Recognition Using EEG-Based Neurological Brain Signals. 2022. In: Recent Trends in Image Processing and Pattern Recognition (RTIP2R 2021) / [ed] KC Santosh, Ravindra Hegadi, Umapada Pal, Springer, 2022, p. 305-319. Conference paper (Refereed)
    Abstract [en]

    Researchers have been using Electroencephalography (EEG) to build Brain-Computer Interface (BCI) systems. They have had a lot of success modeling brain signals for applications, including emotion detection, user identification, authentication, and control. The goal of this study is to employ EEG-based neurological brain signals to recognize imagined objects. The user imagines the object after looking at it on the monitor screen. The EEG signal is recorded while the user thinks about the object. These EEG signals were processed using signal processing methods, and machine learning algorithms were trained to classify the EEG signals. The study involves coarse- and fine-level EEG signal classification. The coarse-level classification categorizes the signals into three classes (Char, Digit, Object), whereas the fine-level classification categorizes the EEG signals into 30 classes. Recognition rates of 97.30% and 93.64% were recorded at coarse- and fine-level classification, respectively. Experiments indicate the proposed work outperforms the previous methods.

  • 32.
    Shridhar, Kumar
    et al.
    MindGarage, Technical University Kaiserslautern, Germany.
    Dash, Ayushman
    MindGarage, Technical University Kaiserslautern, Germany.
    Sahu, Amit
    MindGarage, Technical University Kaiserslautern, Germany.
    Grund Pihlgren, Gustav
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Alonso, Pedro
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Pondenkandath, Vinaychandran
    University of Fribourg, Switzerland.
    Kovács, György
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Simistira, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab. University of Fribourg, Switzerland.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab. MindGarage, Technical University Kaiserslautern, Germany. University of Fribourg, Switzerland.
    Subword Semantic Hashing for Intent Classification on Small Datasets. 2019. In: 2019 International Joint Conference on Neural Networks (IJCNN), IEEE, 2019, article id N-19329. Conference paper (Other academic)
    Abstract [en]

    In this paper, we introduce the use of Semantic Hashing as embedding for the task of Intent Classification and achieve state-of-the-art performance on three frequently used benchmarks. Intent Classification on a small dataset is a challenging task for data-hungry state-of-the-art Deep Learning based systems. Semantic Hashing is an attempt to overcome such a challenge and learn robust text classification. Current word embedding based methods [11], [13], [14] are dependent on vocabularies. One of the major drawbacks of such methods is out-of-vocabulary terms, especially when having small training datasets and using a wider vocabulary. This is the case in Intent Classification for chatbots, where typically small datasets are extracted from internet communication. Two problems arise with the use of internet communication. First, such datasets miss a lot of terms in the vocabulary to use word embeddings efficiently. Second, users frequently make spelling errors. Typically, the models for intent classification are not trained with spelling errors and it is difficult to think about ways in which users will make mistakes. Models depending on a word vocabulary will always face such issues. An ideal classifier should handle spelling errors inherently. With Semantic Hashing, we overcome these challenges and achieve state-of-the-art results on three datasets: Chatbot, Ask Ubuntu, and Web Applications [3]. Our benchmarks are available online.

  • 33.
    Simistira, Fotini
    et al.
    Institute for Language and Speech Processing, Athena Research and Innovation Center, Athens, Greece.
    Ul-hassan, Adnan
    Department of Computer Science, University of Kaiserslautern, Germany.
    Papavassiliou, Vassilis
    Institute for Language and Speech Processing, Athena Research and Innovation Center, Athens, Greece.
    Gatos, Basilis
    NCSR Demokritos, Institute of Informatics and Telecommunications, Athens, Greece.
    Katsouros, Vassilis
    Institute for Language and Speech Processing, Athena Research and Innovation Center, Athens, Greece.
    Liwicki, Marcus
    Department of Computer Science, University of Kaiserslautern, Germany; DIVA Research Group, University of Fribourg, Switzerland.
    Recognition of Historical Greek Polytonic Scripts Using LSTM Networks. 2015. In: 13th International Conference on Document Analysis and Recognition, IEEE, 2015, p. 766-770. Conference paper (Refereed)
    Abstract [en]

    This paper reports on high-performance Optical Character Recognition (OCR) experiments using Long Short-Term Memory (LSTM) Networks for Greek polytonic script. Even though there are many Greek polytonic manuscripts, the digitization of such documents has not been widely applied, and very limited work has been done on the recognition of such scripts. We have collected a large number of diverse document pages of Greek polytonic scripts in a novel database, called Polyton-DB, containing 15,689 textlines of synthetic and authentic printed scripts and performed baseline experiments using LSTM Networks. Evaluation results show that the character error rate obtained with LSTM varies from 5.51% to 14.68% (depending on the document) and is better than two well-known OCR engines, namely, Tesseract and ABBYY FineReader.

  • 34.
    Simistira Liwicki, Foteini
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Gupta, Vibha
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    De, Kanjar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Abid, Nosheen
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Rakesh, Sumit
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Wellington, Scott
    University of Bath, Department of Computer Science, Bath, UK.
    Wilson, Holly
    University of Bath, Department of Computer Science, Bath, UK.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Eriksson, Johan
    Umeå University, Department of Integrative Medical Biology (IMB) and Umeå Center for Functional Brain Imaging (UFBI), Umeå, Sweden.
    Bimodal electroencephalography-functional magnetic resonance imaging dataset for inner-speech recognition. 2023. In: Scientific Data, E-ISSN 2052-4463, Vol. 10, article id 378. Article in journal (Refereed)
    Abstract [en]

    The recognition of inner speech, which could give a ‘voice’ to patients who have no ability to speak or move, is a challenge for brain-computer interfaces (BCIs). A shortcoming of the available datasets is that they do not combine modalities to increase the performance of inner speech recognition. Multimodal datasets of brain data enable the fusion of neuroimaging modalities with complementary properties, such as the high spatial resolution of functional magnetic resonance imaging (fMRI) and the temporal resolution of electroencephalography (EEG), and therefore are promising for decoding inner speech. This paper presents the first publicly available bimodal dataset containing EEG and fMRI data acquired nonsimultaneously during inner-speech production. Data were obtained from four healthy, right-handed participants during an inner-speech task with words in either a social or numerical category. Each of the 8 word stimuli was assessed with 40 trials, resulting in 320 trials in each modality for each participant. The aim of this work is to provide a publicly available bimodal dataset on inner speech, contributing towards speech prostheses.

  • 35.
    Simistira Liwicki, Foteini
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Gupta, Vibha
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    De, Kanjar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Abid, Nosheen
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Rakesh, Sumit
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Wellington, Scott
    Department of Computer Science, University of Bath, United Kingdom.
    Wilson, Holly
    Department of Computer Science, University of Bath, United Kingdom.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Eriksson, Johan
    Department of Integrative Medical Biology (IMB), Umeå University, Sweden.
    Bimodal pilot study on inner speech decoding reveals the potential of combining EEG and fMRI. Manuscript (preprint) (Other academic)
    Abstract [en]

    This paper presents the first publicly available bimodal electroencephalography (EEG) / functional magnetic resonance imaging (fMRI) dataset and an open source benchmark for inner speech decoding. Decoding inner speech or thought (expressed through a voice without actual speaking) is a challenge with typical results close to chance level. The dataset comprises 1280 trials (4 subjects, 8 stimuli = 2 categories * 4 words, and 40 trials per stimulus) in each modality. For binary classification, the pilot study reports a mean accuracy of 71.72% when combining the two modalities (EEG and fMRI), compared to 62.81% and 56.17% when using EEG and fMRI alone, respectively. The same improvement in performance can be observed for word classification (8 classes): 30.29% with the combination, compared to 22.19% and 17.50% without. As such, this paper demonstrates that combining EEG with fMRI is a promising direction for inner speech decoding.
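
    The trial counts in this abstract follow from simple multiplication, and the reported accuracies are easier to read against the chance level for each task. The sketch below reproduces that arithmetic from the numbers quoted in the abstract; the function names are illustrative, and the chance levels assume perfectly balanced classes, which the abstract does not state explicitly.

```python
# Counts quoted in the abstract
SUBJECTS = 4
CATEGORIES = 2
WORDS_PER_CATEGORY = 4
TRIALS_PER_STIMULUS = 40


def trial_counts(subjects, categories, words_per_category, trials_per_stimulus):
    """Return stimulus and trial counts per modality (illustrative helper)."""
    stimuli = categories * words_per_category
    per_subject = stimuli * trials_per_stimulus
    return {
        "stimuli": stimuli,
        "per_subject": per_subject,
        "total": subjects * per_subject,
    }


def chance_level(n_classes):
    """Expected accuracy of random guessing, assuming balanced classes."""
    return 1.0 / n_classes


counts = trial_counts(SUBJECTS, CATEGORIES, WORDS_PER_CATEGORY, TRIALS_PER_STIMULUS)
binary_chance = chance_level(CATEGORIES)       # category (binary) classification
word_chance = chance_level(counts["stimuli"])  # word (8-class) classification
```

    Under the balanced-class assumption, chance is 50% for the binary task and 12.5% for the 8-word task, which puts the combined-modality accuracies of 71.72% and 30.29% in context.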

  • 36.
    Simistira Liwicki, Foteini
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Perise, Pedro Malo
    University of Zaragoza, Spain.
    Visi, Federico
    Luleå University of Technology, Department of Arts, Communication and Education, Music, media and Theatre.
    Östersjö, Stefan
    Luleå University of Technology, Department of Arts, Communication and Education, Music, media and Theatre.
    Analysing Musical Performance in Videos Using Deep Neural Networks. 2020. In: Proceedings of the 1st Joint Conference on AI Music Creativity, AIMC, Stockholm, Sweden, 2020. Conference paper (Refereed)
    Abstract [en]

    This paper proposes a method to facilitate labelling of music performance videos with automatic methods (3D-Convolutional Neural Networks) instead of tedious labelling by human experts. In particular, we are interested in the detection of the 17 musical performance gestures generated during the performance (guitar playing) of musical pieces which have been video-recorded. In earlier work, these videos have been annotated manually by a human expert according to the labels in the musical analysis methodology. Such a labelling method is time-consuming and would not be scalable to big collections of video recordings. In this paper, we use a 3D-CNN model from activity recognition tasks and adapt it to the music performance dataset following a transfer learning approach. In particular, the weights of the first blocks were kept and only the later layers as well as additional classification layers were re-trained. The model was evaluated on a set of 17 music performance gestures and reports an average accuracy of 97.9% (F1: 77.8%) on the training set and 85.7% (F1: 38.6%) on the test set. An additional analysis shows which gestures are particularly difficult and suggests improvements for future work.

  • 37.
    Simán, Filip
    et al.
    Luleå University of Technology, Department of Civil, Environmental and Natural Resources Engineering, Geosciences and Environmental Engineering.
    Jansson, Nils F.
    Luleå University of Technology, Department of Civil, Environmental and Natural Resources Engineering, Geosciences and Environmental Engineering.
    Johnson, Sean
    Exploration Department, Boliden Mineral AB, Boliden, Sweden.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Rincon, Jonathan
    Luleå University of Technology, Department of Civil, Environmental and Natural Resources Engineering, Geosciences and Environmental Engineering.
    Nordfeldt, Erik
    Exploration Department, Boliden Mineral AB, Boliden, Sweden.
    Persson, Mac F.
    Exploration Department, Boliden Mineral AB, Boliden, Sweden.
    McDonnell, Paul
    Exploration Department, Boliden Mineral AB, Boliden, Sweden.
    Hermansson, Tobias
    Exploration Department, Boliden Mineral AB, Boliden, Sweden.
    Fitting Rävliden North Zn-Pb-Ag-Cu deposit host stratigraphy into regional Skellefte district nomenclature. 2022. In: Geological Society of Sweden, 150 year anniversary meeting: Abstract volume / [ed] Bergman Weihed, J.; Johansson, Å.; Rehnström, E., Geologiska Föreningen, 2022, p. 156-157. Conference paper (Other academic)
    Abstract [en]

    Lithofacies logging from Rävliden North in the Skellefte district is presented, and the use of lithostratigraphic names in deposit-scale mapping is discussed. The authors conclude that while Skellefte district nomenclature can be applied, it cannot preserve the level of detail relevant to exploration.

  • 38.
    Simán, Filip
    et al.
    Luleå University of Technology, Department of Civil, Environmental and Natural Resources Engineering, Geosciences and Environmental Engineering.
    Jansson, Nils
    Luleå University of Technology, Department of Civil, Environmental and Natural Resources Engineering, Geosciences and Environmental Engineering.
    Johnson, Sean
    Boliden Mineral AB.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Nordfeldt, Erik
    Boliden Mineral AB.
    Persson, Mac Fjellerad
    Boliden Mineral AB.
    McDonnell, Paul
    Boliden Mineral AB.
    Hermansson, Tobias
    Boliden Mineral AB.
    Understanding lithostratigraphy and deposition of the Rävliden Zn-Pb-Ag-Cu deposit, Sweden. 2022. In: Halifax 2022: Abstract volume, The Geological Association of Canada, 2022, p. 201-. Conference paper (Other academic)
  • 39.
    Simán, Filip
    et al.
    Luleå University of Technology, Department of Civil, Environmental and Natural Resources Engineering, Geosciences and Environmental Engineering.
    Jansson, Nils
    Luleå University of Technology, Department of Civil, Environmental and Natural Resources Engineering, Geosciences and Environmental Engineering.
    Kampmann, Tobias Christoph
    Luleå University of Technology, Department of Civil, Environmental and Natural Resources Engineering, Geosciences and Environmental Engineering.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Rock Classification with Machine Learning: a Case Study from the Zinkgruvan Zn-Pb-Ag Deposit, Bergslagen, Sweden. 2021. In: 2021 Swedish Artificial Intelligence Society Workshop (SAIS), IEEE, 2021, p. 37-41. Conference paper (Refereed)
    Abstract [en]

    In this paper we assess two traditional machine learning (ML) methods which can be used for automatic rock type classification: (1) the Self-Organising Map (SOM) with k-means clustering, and (2) Classification and Regression Trees (CART). The dataset used for this paper consisted of chemical composition data of rocks acquired through X-Ray Fluorescence (XRF) analysis. The ground truth of the dataset was generated by human experts in the field of geology. The complexity of the chosen dataset influenced the evaluation performance of the two ML models. We achieve an overall accuracy of 68.02% and 62.79% when using SOM with k-means and CART, respectively.

  • 40.
    Simán, Filip
    et al.
    Luleå University of Technology, Department of Civil, Environmental and Natural Resources Engineering, Geosciences and Environmental Engineering.
    Jansson, Nils
    Luleå University of Technology, Department of Civil, Environmental and Natural Resources Engineering, Geosciences and Environmental Engineering.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Günther, Christian
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Nordfeldt, Erik
    Exploration Department, Boliden Mineral AB, Boliden, Sweden.
    Fjellerad Persson, Mac
    Exploration Department, Boliden Mineral AB, Boliden, Sweden.
    McDonnell, Paul
    Exploration Department, Boliden Mineral AB, Boliden, Sweden.
    Hermansson, Tobias
    Exploration Department, Boliden Mineral AB, Boliden, Sweden.
    Hydrothermal alteration at the Rävliden North VMS deposit, Skellefte district, Sweden. Manuscript (preprint) (Other academic)
    Abstract [en]

    Alteration envelopes around volcanogenic massive sulphide (VMS) deposits are commonly several kilometres larger than the associated mineral deposit, making them suitable exploration targets. Furthermore, these alteration envelopes are usually zoned with different alteration types and variable alteration intensity at different distances from mineralisation, whereby they can be used as ore vectors. However, alteration mapping is subjective and can vary between geologists. Therefore, quantitative approaches using whole rock lithogeochemistry are useful for mapping alteration. One such approach uses mass balance calculations to quantify mass changes in mobile elements resulting from alteration. A challenge with this technique is that it relies on sampling least-altered volcanic rocks representative of precursor compositions. Crucially, mass balance calculation depends on how well constrained the least-altered samples are and as such, these samples are chosen with great care; however, in some exploration projects sub-optimal least-altered sample choices are made. Furthermore, uncertainties stemming from sampling and analytical errors, and uncertainty in modelling fractionation curves for major mobile elements, influence the certainty of variables going into mass balance calculations. It is therefore of interest to study the impact of least-altered sample choices and error propagation on the final mass balance calculation, allowing future studies to make more informed decisions regarding selection criteria for least-altered samples.

    This study uses the Rävliden North deposit in the western part of the Palaeoproterozoic Skellefte district, Northern Sweden, as a natural laboratory to test the impact of least-altered sample choices. This deposit is a relatively recent Zn-Pb-Ag-Cu VMS discovery approximately 4 km west of the currently operated Kristineberg mine. Alteration envelopes with varied amounts of quartz, sericite, chlorite and talc are commonly associated with VMS deposits in the Skellefte district, with an alteration intensity locally strong enough to eradicate textures of the original lithofacies. Furthermore, the deposits are modified by polyphase deformation, greenschist to amphibolite facies metamorphism, and remobilization. Combined, these make stratigraphic analysis and lithofacies mapping difficult, which motivates the use of lithogeochemical techniques when exploring for these deposits. In the Rävliden North area, two styles of VMS mineralisation occur: 1) a semi-massive to massive Sp+Pyh+Gn±Py mineralisation hosted in the Rävliden formation in upper parts of the Skellefte group (SG), and 2) a stringer Ccp+Pyh+Py mineralisation occurring in both the SG and Rävliden formation.

    By comparing mass change results from calculations using two datasets with differently constrained least-altered sample choices, this study concludes that absolute mass changes are sensitive to the least-altered sample choice and that, with a 50% confidence interval on regression, the uncertainty in mass change is approximately 0.5 wt.% for MgO, FeO, and CaO, 0.2 for K2O and Na2O, and 5 wt.% for SiO2. However, regardless of the least-altered sample choice, qualitative recognition of ore vectors is possible. Furthermore, this study finds the following ore vectors to Rävliden North: 1) semi-regional Na2O and CaO loss, 2) distal K2O gain, 3) proximal K2O loss, 4) proximal CaO gain associated with the Sp+Pyh+Gn±Py mineralisation, 5) proximal MgO and Fe2O3 gains associated with both the Sp+Pyh+Gn±Py and Ccp+Pyh+Py mineralisation, and 6) proximal erratic gains and losses in SiO2 associated with both mineralisation types.

  • 41.
    Simán, Filip
    et al.
    Luleå University of Technology, Department of Civil, Environmental and Natural Resources Engineering, Geosciences and Environmental Engineering.
    Jansson, Nils
    Luleå University of Technology, Department of Civil, Environmental and Natural Resources Engineering, Geosciences and Environmental Engineering.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Günther, Christian
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Nordfeldt, Erik
    Exploration Department, Boliden Mineral AB, Boliden, Sweden.
    Fjellerad Persson, Mac
    Exploration Department, Boliden Mineral AB, Boliden, Sweden.
    McDonnell, Paul
    Exploration Department, Boliden Mineral AB, Boliden, Sweden.
    Hermansson, Tobias
    Exploration Department, Boliden Mineral AB, Boliden, Sweden.
    Stratigraphy, Facies, and Chemostratigraphy at the Palaeoproterozoic Rävliden North Zn-Pb-Ag-Cu VMS deposit, Skellefte district, Sweden. Manuscript (preprint) (Other academic)
    Abstract [en]

    The Rävliden North deposit in the western part of the Palaeoproterozoic Skellefte district, Northern Sweden, is a relatively recent Zn-Pb-Ag-Cu volcanogenic massive sulphide (VMS) discovery c. 4 km west of the currently operated Kristineberg mine. The VMS deposits in the Skellefte district are hosted in greenschist to amphibolite facies rocks and occur at the lithostratigraphic contact between the metavolcanic 1.89 – 1.88 Ga Skellefte group (SG) and stratigraphically overlying metasiliciclastic 1.89 – 1.87 Ga Vargfors group (VG). All of the deposits are associated with hydrothermal alteration envelopes of quartz+sericite±chlorite±talc where intense alteration commonly eradicates original rock textures, making geological interpretation difficult. Hence, to complement lithofacies analysis, immobile element ratio chemostratigraphy is used. Furthermore, the Rävliden North area is modified by polyphase deformation and remobilization making stratigraphic correlation difficult. 

    This study concludes that: 1) the Rävliden North VMS deposit is hosted in upper parts of the SG, where a semi-massive to massive sphalerite+pyrrhotite+galena±pyrite mineralisation is hosted in and above a stratigraphic unit herein defined as the Rävliden formation, and stringer-type chalcopyrite+pyrrhotite+pyrite mineralisation occurs structurally and stratigraphically below in both the SG and Rävliden formation. The Rävliden formation represents a heterogeneous volcanic succession with rhyolite, dacite and andesite deposited in a deep-marine environment; 2) the VMS deposit formed by subseafloor replacement and is associated with intensely altered rhyolite and graphitic phyllite, where the former saw early Cal alteration and was as such a suitable reactive host for metalliferous fluids, and the latter may have functioned as a sealing layer concentrating fluid flow; 3) post-mineralisation andesitic cryptodomes, andesitic turbidites and mafic mass-flow deposits suggest continued volcanic activity during the deposition of the VG that otherwise consists of graphitic phyllite, breccia-conglomerate and silt- and sandstone. The mafic mass-flow deposits suggest that a primitive source was tapped at a late extensional phase in the Skellefte district.

  • 42.
    Simán, Filip
    et al.
    Luleå University of Technology, Department of Civil, Environmental and Natural Resources Engineering, Geosciences and Environmental Engineering.
    Jansson, Nils
    Luleå University of Technology, Department of Civil, Environmental and Natural Resources Engineering, Geosciences and Environmental Engineering.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Hermansson, Tobias
    Boliden Mines, Boliden, Sweden.
    Nordfeldt, Erik
    Boliden Mines, Boliden, Sweden.
    Fjellerad Persson, Mac
    Boliden Mines, Boliden, Sweden.
    McDonnell, Paul
    Boliden Mines, Boliden, Sweden.
    Johnson, Sean
    Boliden Mines, Boliden, Sweden.
    Gustafsson, Jon
    Boliden Mines, Boliden, Sweden.
    Linking Lithofacies and Chemostratigraphy, Rävliden North VHMS deposit, Skellefte district, Sweden. 2023. In: Proceedings of the 17th SGA Biennial Meeting, 28 August – 1 September 2023, Zurich, Switzerland, The Society for Geology Applied to Mineral Deposits (SGA), 2023, Vol. 2, p. 72-75. Conference paper (Refereed)
  • 43.
    Vacalopoulou, A.
    et al.
    ILSP / Athena R.C., GREECE.
    Gardelli, Viktor
    Luleå University of Technology, Department of Health, Education and Technology, Education, Language, and Teaching.
    Karafyllidis, T.
    University of Cyprus, CYPRUS.
    Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Mokayed, Hamam
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Papaevripidou, M.
    University of Cyprus, CYPRUS.
    Paraskevopoulos, G.
    ILSP / Athena R.C., GREECE.
    Stamouli, S.
    ILSP / Athena R.C., GREECE.
    Katsamanis, A.
    ILSP / Athena R.C., GREECE.
    Katsouros, V.
    ILSP / Athena R.C., GREECE.
    AI4EDU: An Innovative Conversational AI Assistant For Teaching And Learning. 2024. In: INTED2024 Conference Proceedings / [ed] Luis Gómez Chova; Chelo González Martínez; Joanna Lees, IATED Academy, 2024, p. 7119-7127. Conference paper (Refereed)
  • 44.
    Xie, Yejing
    et al.
    Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, 44000, Nantes, France.
    Mouchère, Harold
    Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, 44000, Nantes, France.
    Simistira Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Rakesh, Sumit
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Saini, Rajkumar
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Nakagawa, Masaki
    Tokyo University of Agriculture and Technology, Fuchu, Japan.
    Nguyen, Cuong Tuan
    FPT University, Hanoi, Vietnam.
    Truong, Thanh-Nghia
    Tokyo University of Agriculture and Technology, Fuchu, Japan.
    ICDAR 2023 CROHME: Competition on Recognition of Handwritten Mathematical Expressions. 2023. In: Document Analysis and Recognition - ICDAR 2023, Part II / [ed] Gernot A. Fink, Rajiv Jain, Koichi Kise & Richard Zanibbi, Springer, 2023, p. 553-565. Conference paper (Refereed)
    Abstract [en]

    This paper overviews the 7th edition of the Competition on Recognition of Handwritten Mathematical Expressions. ICDAR 2023 CROHME proposes three tasks with three different modalities: on-line, off-line and bimodal. 3905 new handwritten equations have been collected to propose new training, validation and test sets for the two modalities. The complete training set includes the previous CROHME training set extended with complementary off-line samples (from the OffRaSHME competition) and on-line samples (generated). The evaluation is conducted using the same protocol as the previous CROHME, allowing a fair comparison with previous results. This competition allows, for the first time, the comparison of the on-line and off-line systems on the same test set. Six participating teams have been evaluated. Finally, the same team won all 3 tasks with an expression recognition rate of more than 80%.

  • 45.
    Zarris, Dimitrios
    et al.
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Sozos, Stergios
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Simistira Liwicki, Foteini
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Gardelli, Viktor
    Luleå University of Technology, Department of Health, Education and Technology, Education, Language, and Teaching.
    Karafyllidis, Teodoris
    University of Cyprus, Cyprus.
    Stamouli, Spyridoula
    Athena Research Center, Greece.
    Papaevripidou, Marios
    University of Cyprus, Cyprus.
    Vacalopoulou, Anna
    Athena Research Center, Greece.
    Paraskevopoulos, George
    Athena Research Center, Greece.
    Katsamanis, Nassos
    Athena Research Center, Greece.
    Katsouros, Vassilis
    Athena Research Center, Greece.
    Liwicki, Marcus
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Mokayed, Hamam
    Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Embedded Internet Systems Lab.
    Enhancing Educational Paradigms with Large Language Models: From Teacher to Study Assistants in Personalized Learning. 2024. In: EDULEARN24 Proceedings: 16th International Conference on Education and New Learning Technologies 1-3 July, 2024, Palma, Spain / [ed] Luis Gómez Chova; Chelo González Martínez; Joanna Lees, IATED Academy, 2024, p. 1295-1303. Conference paper (Refereed)
    Abstract [en]

    This paper investigates the application of large language models (LLMs) in the educational field, specifically focusing on roles like "Teacher Assistant" and "Study Assistant" to enhance personalized and adaptive learning. The significance of integrating AI in educational frameworks is underscored, given the shift towards AI-powered educational tools. The methodology of this research is structured and multifaceted, examining the dynamics between prompt engineering, methodological approaches, and LLM outputs with the help of indexed documents. The study bifurcates its approach into prompt structuring and advanced prompt engineering techniques. Initial investigations revolve around persona and template prompts to evaluate their individual and collective effects on LLM outputs. Advanced techniques, including few-shot and chain-of-thought prompting, are analyzed for their potential to elevate the quality and specificity of LLM responses. The "Study Assistant" aspect of the study involves applying these techniques to educational content across disciplines such as biology, mathematics, and physics. Findings from this research are poised to contribute significantly to the evolution of AI in education, offering insights into the variables that enhance LLM performance. This paper not only enriches the academic discourse on LLMs but also provides actionable insights for the development of sophisticated AI-based educational tools. As the educational landscape continues to evolve, this research underscores the imperative for continuous exploration and refinement in the application of AI to fully realize its benefits in education.
