About the Role

Location: Remote / Client Location

Experience: 3+ Years

Employment Type: Full-Time (Client Deployment via Hubnex Labs)

Role Overview

Join a pioneering AI team to design and deploy cutting-edge deep learning solutions for computer vision and audio analysis. You’ll leverage CNNs, Vision Transformers, attention mechanisms, and multi-modal techniques to solve complex real-world challenges in object detection, video processing, and audio classification.

Key Responsibilities

Design, develop, and optimize deep learning models for image/video analysis (object detection, segmentation) and audio classification tasks.
Implement and fine-tune CNN architectures, Vision Transformers (ViT, Swin), and attention mechanisms (SE, CBAM, self/cross-attention).
Process multi-modal data: for video, apply spatiotemporal modeling (3D CNNs, temporal attention); for audio, extract features (spectrograms, MFCCs) and build classification pipelines.
Apply transfer learning with pretrained models and use multi-task learning frameworks.
Optimize models for accuracy, speed, and robustness using PyTorch/TensorFlow.
Collaborate with MLOps teams to deploy solutions into production.

Required Skills

Programming: Advanced Python (PyTorch/TensorFlow)
Computer Vision: Vision Transformers (ViT, Swin, DeiT); object detection (YOLO, SSD, Faster R-CNN, DETR); video analysis (temporal modeling)
Audio Processing: Feature extraction (MFCCs, spectrograms) and classification
Modeling Expertise: Attention mechanisms (self/cross-attention, SE, CBAM); transfer learning and fine-tuning; training strategies (LR scheduling, early stopping, data augmentation)
Experience handling large-scale datasets and building data pipelines.

Preferred Qualifications

Exposure to multi-modal learning (combining vision/audio/text)
Familiarity with R for statistical analysis
Publications or projects at CVPR, NeurIPS, or ICML

This role is for a client of Hubnex Labs. Selected candidates will represent Hubnex while working directly with the client’s AI team.

Skills: data handling, audio processing, video analysis, TensorFlow, AI/ML, attention mechanisms, R, computer vision, Vision Transformers, object detection, transfer learning, feature extraction, advanced Python, PyTorch

Deep Learning Engineer (Computer Vision & Audio Analysis)
