We say what we do, we do what we say,
and we explain why it works.
AI is not magic. It is linear algebra, probability, tensor computation, optimisation and a great deal of engineering.
At Cybelia Cloud, we work at the model level: architecture choice, loss function, regularisation, cross-validation.
We support companies running serious projects at the frontier of mathematics and computer science that need a partner who can hold a technical conversation.
Four technical pillars. One single requirement: rigour.
From raw signal to production model — we cover the entire chain.
Computer vision
CNNs, convolutions, pooling, object detection (YOLO, R-CNN), semantic segmentation. From raw image to feature vector.
OCR & Recognition
Tesseract pipeline, OpenCV preprocessing, hOCR, NLP post-correction. Structured extraction from scanned documents, invoices and forms.
NLP & Voice
STT on Android with Sherpa-onnx. Acoustic model, language model, MFCC, VAD. WER as the reference metric.
Machine Learning
Supervised and unsupervised models, feature engineering, GridSearchCV, cross-validation, F1/AUC/mAP metrics. PyTorch, TensorFlow, scikit-learn.
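WER, named above as the reference metric for speech-to-text, is the word-level edit distance between the reference transcript and the recogniser output, divided by the number of reference words. The following is a minimal sketch, assuming whitespace tokenisation and no text normalisation; the function name is illustrative and not taken from any library.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)
```

For example, word_error_rate("turn on the light", "turn off the light") gives 0.25: one substitution over four reference words. In practice both transcripts are usually normalised (case, punctuation, numerals) before scoring.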
OCR on administrative documents and invoices
Real problems. Solutions that ship.
1. OpenCV preprocessing
deskew, automatic binarisation (Otsu threshold), denoising (median filter, morphological operations).
2. Zone segmentation
text block, table and field detection via contour analysis.
3. Tesseract recognition (LSTM)
--psm 6 configuration (page treated as a single uniform block of text), fine-tuning on a business corpus.
4. NLP post-correction
error detection with a domain dictionary, correction via Levenshtein distance.
5. JSON structuring
field mapping → target schema, validation via business rules.
Result: recognition rate > 92% on the test corpus, processing time < 800 ms per page.
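Step 4 of the pipeline above can be illustrated with a small dictionary-based corrector. This is a sketch under stated assumptions, not the production implementation: the function names, the sample dictionary and the maximum distance of 2 are all hypothetical choices made for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                       # deletion
                curr[j - 1] + 1,                   # insertion
                prev[j - 1] + (char_a != char_b),  # substitution or match
            ))
        prev = curr
    return prev[-1]


def correct_token(token: str, dictionary: set[str], max_distance: int = 2) -> str:
    """Replace an OCR token by its closest dictionary word, if close enough."""
    if not dictionary or token in dictionary:
        return token
    # Ties on distance are broken alphabetically to keep the result deterministic.
    best = min(dictionary, key=lambda word: (levenshtein(token, word), word))
    return best if levenshtein(token, best) <= max_distance else token


# Hypothetical domain dictionary for invoice fields.
domain_words = {"invoice", "total", "amount", "date"}
corrected = [correct_token(t, domain_words) for t in ["lnvoice", "tota1", "xyzzyq"]]
```

Here "lnvoice" and "tota1" are each one edit away from a dictionary word and are corrected, while "xyzzyq" is too far from every entry and is left unchanged. The comparison is case-sensitive, so tokens would normally be lower-cased first.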
Object detection and classification with CNNs
Problem: identify and locate specific items in a video stream or industrial images.
Architecture:
- CNN backbone — convolutional layers (3×3, stride 1),
batch normalisation, ReLU, max pooling.
- Transfer learning from ResNet-50 pretrained on ImageNet —
fine-tuning the last layers on the business dataset.
- Detection head — bounding-box regression +
multiclass classification (Softmax).
- Combined loss: cross-entropy for classification + L1/IoU for localisation.
Training: PyTorch, Adam (lr=1e-4),
cosine annealing scheduler, data augmentation (flip, crop, jitter).
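The cosine annealing schedule mentioned above decays the learning rate from its starting value along half a cosine period. The following is a minimal closed-form sketch, assuming a single cycle without warm restarts and a minimum rate of 0; the function name and the 100-step horizon are illustrative, and only lr=1e-4 comes from the text.

```python
import math


def cosine_annealing_lr(step: int, total_steps: int,
                        lr_max: float = 1e-4, lr_min: float = 0.0) -> float:
    """Learning rate at a given step of a single cosine annealing cycle."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    progress = min(max(step, 0), total_steps) / total_steps  # clamp to [0, 1]
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


# Rate at the start, midpoint and end of a hypothetical 100-step run.
schedule = [cosine_annealing_lr(s, 100) for s in (0, 50, 100)]
```

The rate starts at lr_max, passes through the midpoint between lr_max and lr_min halfway through training, and ends at lr_min.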
mAP@0.5: 87.3% on the validation set.
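The 0.5 in mAP@0.5 is an intersection-over-union (IoU) threshold: a predicted box counts as a correct detection only if its IoU with a ground-truth box of the same class reaches 0.5. The following is a minimal sketch of the IoU computation, assuming axis-aligned boxes given as (x_min, y_min, x_max, y_max); the function name is illustrative.

```python
Box = tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(box_a: Box, box_b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap, clamped at 0 for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# A prediction covering 80% of the ground-truth box and nothing outside it.
is_match = iou((0, 0, 10, 10), (0, 0, 10, 8)) >= 0.5
```

Average precision is then computed per class from the ranked list of matched and unmatched detections, and mAP is the mean over classes.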
Predictive model on business data
Problem: anticipate a business event
(failure, churn, anomaly) from heterogeneous historical data.
Methodology:
- Exploration and cleansing — missing values (KNN imputation),
outliers (IQR), categorical encoding (target encoding).
- Feature engineering — sliding time windows,
statistical aggregates, derived features.
- Model selection — Random Forest, XGBoost,
LightGBM compared with stratified cross-validation (k=5).
- Hyperparameter tuning — Optuna / GridSearchCV.
- Interpretability — SHAP values for business explainability.
Metrics: F1-score 0.89, AUC-ROC 0.94 on the test set.
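The IQR rule listed under exploration and cleansing flags values that lie more than k times the interquartile range outside the first and third quartiles. The following is a minimal standard-library sketch, assuming the conventional k = 1.5 and the "inclusive" quartile method; the function name and the sample readings are hypothetical.

```python
from statistics import quantiles


def iqr_outliers(values: list[float], k: float = 1.5) -> list[float]:
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical sensor readings with one obvious anomaly.
readings = [10, 12, 11, 13, 12, 11, 100]
flagged = iqr_outliers(readings)
```

With these readings the quartiles are 11 and 12.5, so the fences sit at 8.75 and 14.75 and only the value 100 is flagged. Whether flagged values are dropped, capped or kept as a signal depends on the business event being predicted.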
Open-source tools: proven, documented and maintainable
No opaque proprietary frameworks. Every component is audited, understood and mastered.
Vision & Image
OpenCV · Pillow · scikit-image · Tesseract 5 · PyTorch · torchvision · ONNX Runtime
Audio & NLP
Vosk · Sherpa-onnx · WebRTC VAD · NLTK · spaCy · HuggingFace Transformers · Kaldi
ML & Data
scikit-learn · XGBoost · LightGBM · Optuna · SHAP · Pandas · NumPy · Matplotlib
From problem to production model, no detours
Scientific framing
We start by understanding the real problem, not the imagined solution. Formal definition of the task, inputs/outputs and success metrics.
Data & exploration
Audit of the available data — volume, quality, bias, distribution. We promise nothing before having seen the data.
Experimentation & baseline
A simple baseline model first, then controlled iterations with tracked metrics. Reproducibility guaranteed.
Deployment & integration
ONNX export, REST API or native integration (Android JNI, Python module). Technical documentation delivered with the model.
You have a hard problem you want solved properly.
CIOs & technical leaders
You have an AI project under way or under consideration and you're looking for a rigorous outside opinion to frame, evaluate or de-risk it.
Deeptech startups
You have a strong idea but lack the ML, vision or NLP resources to take a proof of concept to product.
R&D project owners
You're working on a subject at the frontier of AI and mathematics and you need a technical partner, not a general-purpose service provider.
You have a hard problem. We love that.
Describe your project in a few lines — we'll respond with a technical analysis, not a sales quote.