Marcos García

Featured Projects

Click on any project to explore the details, technologies used, and access live demos or repositories.

New Project

Virtual guide for Menorca Talayótica with RAG

Featured Project

Awi - Menorca Talayótica Guide

RAG AI LlamaIndex

Multi-agent system for automated ML training

Featured Project

MIDAS

AI Multi-Agent System Machine Learning

Interactive storytelling with AI assistance

Live Demo Available

LLM StoryTeller

Python Streamlit LLM

Comparing RAG vs Large Context Window approaches

Live Demo Available

Chatbot CV

AI - LLM + NLP RAG Context Window

AI-powered fox detection in images and video

Fox-Detector

AI - Computer Vision Python

Predictive analytics for storage reliability

Hard Drive Failure Prediction

AI - Data + ML Predictive Analytics

Excel to JSONL converter for ChatGPT fine-tuning

XLSX to JSONL

Python Script Data Conversion

Enhanced virtual space experience and analytics

Gather-Tracker

JS Bot/Script Analytics

Awi - Menorca Talayótica Guide

RAG Speech-to-Text Text-to-Speech Gemini Flash 2.0 LlamaIndex

Project Overview

Awi is a virtual guide created to enhance the visitor experience at Menorca Talayótica, a UNESCO World Heritage candidate site. This AI-powered assistant provides instant access to information about archaeological sites, history, and visitor recommendations through a user-friendly conversational interface.

Key Features

24/7 bilingual virtual guide with extensive knowledge of Menorca's archaeological heritage
Advanced RAG system for accurate information retrieval from verified sources
Voice interaction with OpenAI's TTS and STT capabilities
Responsive design for seamless use on mobile devices while exploring sites

Technical Implementation

Awi leverages a sophisticated RAG architecture built on LlamaIndex, with OpenAI embeddings for semantic search and Gemini Flash 2.0 as the core LLM. The system incorporates voice interaction through OpenAI's text-to-speech and speech-to-text APIs, creating a seamless and natural interaction experience for users exploring archaeological sites.

The knowledge base contains curated information about Menorca's prehistoric monuments, including navetas, talayots, and taulas, as well as practical visitor information. This project represents a practical application of AI to enhance cultural tourism and education.

Real-World Application

Beyond being a technical showcase, Awi addresses a real need for accessible information about Menorca's archaeological heritage. Visitors can ask detailed questions about sites they're exploring, get historical context, and receive personalized recommendations - all through a conversational interface that feels like having a knowledgeable guide available at all times.

Try Awi Live

MIDAS

Artificial Intelligence Multi-Agent System Machine Learning Python

Project Overview

MIDAS is a sophisticated multi-agent system designed to automate the training of machine learning models. It leverages distributed intelligence to optimize model parameters, feature selection, and performance tuning without human intervention.

Key Features

Automated model selection and hyperparameter optimization
Intelligent feature engineering and selection
Collaborative learning between multiple specialized agents
Explainable AI capabilities for understanding model decisions

Technical Implementation

MIDAS utilizes a network of specialized agents, each responsible for different aspects of the machine learning pipeline. The architecture allows for scaling and adapting to various ML tasks while maintaining high performance and accuracy.

Visit MIDAS Website View on GitHub

LLM StoryTeller

LLM StoryTeller is an interactive web application that leverages Large Language Models (LLMs) to help users craft captivating stories effortlessly. This project demonstrates my expertise in Python, Streamlit, and working with LLMs.

Features

Interactive story creation with AI assistance
Multiple story genres and themes to choose from
Character development and plot suggestions
Export stories in multiple formats

Technologies Used

Python Streamlit Large Language Models UI/UX Design

Implementation Highlights

The application uses small language models to generate context-aware story segments based on user input. The Streamlit framework provides an intuitive interface that makes the creative process accessible to users of all skill levels.

Live Demo View on GitHub

Chatbot CV: RAG vs Context Window

Project Overview

This project implements and compares two advanced chatbot architectures applied to curriculum information retrieval. It analyzes the differences in performance, accuracy, and user experience between RAG (Retrieval-Augmented Generation) and Large Context Window approaches.

LLM + NLP RAG Context Window Python JavaScript

Dual Architecture

RAG Approach: LlamaIndex + Llama 3.3 70B with real-time information retrieval
Context Window Approach: Gemini Flash with complete document preloading
Intelligent Classification: BERT model for query complexity analysis
Optimized Backend: Flask-based API with session management

Technical Implementation

RAG Implementation

Uses LlamaIndex with BGE-M3 embeddings to create vector representations of CV content. The system retrieves relevant chunks based on query similarity and augments the LLM's context.

Context Window Implementation

Leverages Gemini 2.0 Flash's million-token context window to load the entire CV into the initial prompt, allowing the model to have a complete view of all information.

Key Findings

The project reveals interesting trade-offs between approaches. RAG offers greater flexibility and can handle larger knowledge bases, while the large context window provides more consistent understanding of document relationships and faster responses for complex queries.

Live Demo View on GitHub

Fox-Detector

Computer Vision Deep Learning Python OpenCV

Project Overview

Fox-Detector is an AI-powered computer vision project that can detect and track foxes in images and video streams. This project showcases my expertise in computer vision and deep learning techniques.

Key Features

Real-time fox detection in video streams
High-accuracy classification model
Motion tracking capabilities
Lightweight deployment options

Technical Implementation

The project uses a convolutional neural network trained on a custom dataset of fox images. The model is optimized for both accuracy and performance, allowing it to run efficiently on various hardware configurations.

View on GitHub

Gather-Tracker

JavaScript Bot Development Analytics Virtual Spaces

Project Overview

Gather-Tracker is a JavaScript bot/script that enhances the Gather virtual space experience. It provides tracking and analytics capabilities for virtual events and meetings, enabling better space management and user engagement analysis.

Key Features

Real-time user activity tracking

Technical Implementation

The bot integrates with Gather's API to collect positional and interaction data. It processes this information in real-time and generates insightful analytics that help event organizers optimize their virtual spaces and understand user behavior patterns.

View on GitHub

XLSX to JSONL

Python Pandas JSON LLM Fine-tuning

Project Overview

XLSX to JSONL is a Python script that converts Excel (.xlsx) files into JSONL format, facilitating the fine-tuning of ChatGPT with structured training and validation datasets. This tool streamlines the preparation of custom training data for language models.

Key Features

Simple Excel to JSONL conversion
Customizable field mapping
Automatic training/validation split
OpenAI API format compatibility

Technical Implementation

The script uses Pandas to read Excel data, allowing it to handle large datasets efficiently. It implements intelligent parsing of conversation pairs and formats them according to OpenAI's fine-tuning specifications, making it easy to prepare custom datasets for LLM training.

View on GitHub

Hard Drive Failure Prediction

Python Machine Learning SMART Data Predictive Analytics

Project Overview

Hard Drive Failure Prediction is a predictive model designed to forecast hard drive failures using manufacturer data, capacity, and SMART metrics. The project aims to boost storage reliability and enable proactive maintenance before catastrophic failures occur.

Key Features

Multi-feature failure prediction model
SMART data analysis and interpretation
Risk assessment and prioritization
Drive lifespan estimation

Technical Implementation

The project uses scikit-learn to implement various machine learning algorithms (Random Forest, Gradient Boosting, and Neural Networks) that analyze patterns in SMART attributes. By training on a large dataset of drives with known outcomes, the model can identify early warning signs of impending failures.

View on GitHub

Experimental Projects

Featured Projects

Awi - Menorca Talayótica Guide

MIDAS

LLM StoryTeller

Chatbot CV

Fox-Detector

Hard Drive Failure Prediction

XLSX to JSONL

Gather-Tracker

Want to collaborate on a project?