VektorPedia: Insect Vector Search Engine (Master Thesis)
🔗 URL:
⏱️ Working Period: August 2022 - December 2023
Overview:
This project, conducted under the guidance of a supervisor as part of my master’s thesis, focused on creating a search engine named VektorPedia for identifying insects that carry plant viruses. By leveraging the power of knowledge graphs (KGs) and the latest advancements in data representation, we aimed to enhance the understanding and management of biodiversity information.
Project Details:
Objective: The primary objective of VektorPedia was to identify insect vectors that carry plant viruses and provide detailed information about these vectors. The search engine aimed to facilitate better understanding and management of plant disease vectors through advanced data integration and network analysis.
Methodology: We ingested data from four existing sources: Global Biotic Interactions (GloBI), Wikidata, DBPedia, and NCBITaxonOntology. By integrating these sources, we constructed a single biodiversity knowledge graph that captures the relationships between insects, viruses, and plants.
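The integration step can be pictured with a minimal sketch. It assumes each source has already been reduced to (subject, relation, object) triples that share a common taxon identifier. The identifiers, relation names, and records below are hypothetical placeholders, not data or schema from the thesis.

```python
from collections import defaultdict

# Hypothetical toy triples per source; the real sources expose far richer schemas.
SOURCE_TRIPLES = {
    "GloBI": [
        ("taxon:1", "vectorOf", "taxon:2"),
        ("taxon:2", "pathogenOf", "taxon:3"),
    ],
    "Wikidata": [
        ("taxon:1", "label", "example insect"),
        ("taxon:1", "vectorOf", "taxon:2"),  # also asserted by GloBI
    ],
    "NCBITaxonOntology": [
        ("taxon:1", "rank", "species"),
    ],
}


def merge_sources(source_triples):
    """Union the triples of all sources and record which sources assert each one."""
    provenance = defaultdict(set)
    for source, triples in source_triples.items():
        for triple in triples:
            provenance[triple].add(source)
    return dict(provenance)


merged = merge_sources(SOURCE_TRIPLES)
```

Keeping the set of asserting sources next to each triple is one simple way to deduplicate overlapping facts while still being able to trace where a fact came from.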
Network Analysis: Using the integrated knowledge graph, we conducted network analysis to uncover specific interactions between insects, viruses, and plants. This analysis enabled us to identify key insect vectors and their roles in the transmission of plant viruses.
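As an illustration of this kind of analysis, the sketch below follows insect → virus → plant paths in a small edge list and ranks insects by how many viruses and plants they reach. The edges, relation names, and scoring rule are assumptions made for the example and are not the scoring method used in the thesis.

```python
from collections import defaultdict

# Hypothetical edges as (subject, relation, object) triples.
EDGES = [
    ("insect:A", "vectorOf", "virus:X"),
    ("insect:A", "vectorOf", "virus:Y"),
    ("insect:B", "vectorOf", "virus:X"),
    ("virus:X", "pathogenOf", "plant:P"),
    ("virus:Y", "pathogenOf", "plant:Q"),
    ("virus:Y", "pathogenOf", "plant:P"),
]


def score_vectors(edges):
    """Count, per insect, the viruses it carries and the plants reachable via them."""
    viruses_by_insect = defaultdict(set)
    plants_by_virus = defaultdict(set)
    for subject, relation, obj in edges:
        if relation == "vectorOf":
            viruses_by_insect[subject].add(obj)
        elif relation == "pathogenOf":
            plants_by_virus[subject].add(obj)

    scores = {}
    for insect, viruses in viruses_by_insect.items():
        plants = set()
        for virus in viruses:
            # Second hop: plants infected by a virus this insect carries.
            plants |= plants_by_virus.get(virus, set())
        scores[insect] = {"viruses": len(viruses), "plants": len(plants)}
    return scores


scores = score_vectors(EDGES)
# Rank by number of viruses first, then by number of reachable plants.
ranked = sorted(
    scores,
    key=lambda i: (scores[i]["viruses"], scores[i]["plants"]),
    reverse=True,
)
```

With these toy edges, `insect:A` ranks first because it is linked to two viruses and, through them, to two plants.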
Implementation: The search engine was built as a web application with a Python backend and a VueJS frontend. The backend handles data processing and query management, while the frontend provides the interface for searching and retrieving information about insect vectors. Despite the complexity of the task, we built and deployed the application within three months.
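The source does not name the Python web framework, so the following sketch covers only a framework-independent query handler of the kind a backend like this might expose to the frontend. The index contents, field names, and function names (`search_vectors`, `handle_search`) are hypothetical.

```python
import json

# Hypothetical in-memory index; a real backend would query the knowledge graph.
VECTOR_INDEX = {
    "insect:A": {"name": "Example aphid", "viruses": ["virus:X", "virus:Y"]},
    "insect:B": {"name": "Example whitefly", "viruses": ["virus:X"]},
}


def search_vectors(query, index=VECTOR_INDEX):
    """Return records whose name contains the query (case-insensitive), sorted by name."""
    needle = query.strip().lower()
    if not needle:
        return []
    hits = [
        {"id": key, **record}
        for key, record in index.items()
        if needle in record["name"].lower()
    ]
    return sorted(hits, key=lambda hit: hit["name"])


def handle_search(query):
    """Build the JSON body a search endpoint could return to the web client."""
    return json.dumps({"query": query, "results": search_vectors(query)})
```

A web client would call such an endpoint with the user's search text and render the `results` list. Substring matching is used here only to keep the example short.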
Achievements:
- Data Ingestion and Integration: Successfully ingested and integrated data from GloBI, Wikidata, DBPedia, and NCBITaxonOntology, creating a rich and interconnected biodiversity knowledge graph.
- Network Analysis: Conducted detailed network analysis on the knowledge graph, revealing critical insights into insect, virus, and plant interactions.
- Search Engine Development: Developed VektorPedia, an efficient and user-friendly search engine for identifying insect vectors, within a tight timeline of three months.
- Advanced Data Representation: Leveraged state-of-the-art knowledge graph technology, showcasing its potential in real-world applications for efficient data management and contextual information retrieval.
Additional Context:
This project illustrates a practical application of knowledge graph (KG) technology, a structured representation of knowledge that is increasingly combined with Large Language Models (LLMs) such as GPT and Gemini, for example to ground their answers in curated facts. Compared with unstructured text, KGs make the relationships between entities explicit, which supports contextual retrieval and more efficient data management. VektorPedia shows how KG technology can be applied to real-world challenges in biodiversity and plant disease management.
Conclusion:
Working on VektorPedia was a pivotal experience that allowed me to delve into the latest advancements in knowledge graph technology and apply them to a critical problem in agriculture and biodiversity. The success of this project underscores the potential of KGs in enhancing data integration, analysis, and retrieval, paving the way for innovative solutions in various domains.
Snapshot:
- Summary Idea
- Stages
- Interaction Graph
- Embedded Taxon
- Scoring Insect Vector
- Detail Information