Data Science | Machine Learning with Python for Researchers @datasciencet Channel on Telegram

Data Science | Machine Learning with Python for Researchers

@datasciencet


The Data Science and Python channel is for researchers and advanced programmers

Buy ads: https://telega.io/c/dataScienceT

Admin: @hussein_sheikho

Data Science | Machine Learning with Python for Researchers (English)

Are you a researcher or an advanced programmer looking to delve deeper into the world of data science and machine learning using Python? Look no further than the Data Science and Python channel, also known as @datasciencet. This channel is dedicated to providing valuable insights, resources, and tips for individuals interested in the fields of data science and machine learning. Whether you're a seasoned professional or just starting out, this channel offers something for everyone.

Stay up to date with the latest trends, tools, and techniques in the world of data science and machine learning. Learn how to harness the power of Python, a versatile and powerful programming language, to analyze data, build predictive models, and extract valuable insights. Connect with like-minded individuals, share your knowledge, and collaborate on exciting projects within the community.

In addition to valuable content, the Data Science and Python channel also offers opportunities to promote your own work or products. If you're interested in advertising on the channel, visit https://telega.io/c/dataScienceT for more information. The channel is managed by Admin @hussein_sheikho, who is dedicated to creating a supportive and engaging community for researchers and programmers alike.

Join the Data Science and Python channel today to take your skills to the next level and unlock new opportunities in the world of data science and machine learning. Whether you're looking to enhance your knowledge, network with professionals, or showcase your expertise, don't miss out on this valuable resource for researchers and advanced programmers. Join @datasciencet today!

Data Science | Machine Learning with Python for Researchers

17 Feb, 06:58


Follow me on X

I will share useful courses and posts:

https://x.com/EngSheikho

Data Science | Machine Learning with Python for Researchers

11 Feb, 11:09


SGLang: Efficient Execution of Structured Language Model Programs

12 Dec 2023 · Lianmin Zheng, Liangsheng Yin, Zhiqiang Xie, Chuyue Sun, Jeff Huang, Cody Hao Yu, Shiyi Cao, Christos Kozyrakis, Ion Stoica, Joseph E. Gonzalez, Clark Barrett, Ying Sheng ·

Large language models (LLMs) are increasingly used for complex tasks that require multiple generation calls, advanced prompting techniques, control flow, and structured inputs/outputs. However, efficient systems are lacking for programming and executing these applications. We introduce SGLang, a system for efficient execution of complex language model programs. SGLang consists of a frontend language and a runtime. The frontend simplifies programming with primitives for generation and parallelism control. The runtime accelerates execution with novel optimizations like RadixAttention for KV cache reuse and compressed finite state machines for faster structured output decoding. Experiments show that SGLang achieves up to 6.4x higher throughput compared to state-of-the-art inference systems on various large language and multi-modal models on tasks including agent control, logical reasoning, few-shot learning benchmarks, JSON decoding, retrieval-augmented generation pipelines, and multi-turn chat.
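
To give a feel for the RadixAttention idea, here is a toy, conceptual sketch (plain Python dictionaries, no real KV tensors, and not the SGLang implementation): requests that share a prompt prefix reuse the cached state of that prefix and only compute the remaining suffix.

# Toy prefix cache: each request's per-token "KV entries" are stored for every
# prefix of its token sequence, mimicking the nodes of a radix tree.
class PrefixCache:
    def __init__(self):
        self.cache = {}  # token-tuple prefix -> list of mock KV entries

    def longest_cached_prefix(self, tokens):
        for end in range(len(tokens), 0, -1):
            key = tuple(tokens[:end])
            if key in self.cache:
                return list(key), self.cache[key]
        return [], []

    def insert(self, tokens, kv_state):
        for i in range(1, len(tokens) + 1):
            self.cache[tuple(tokens[:i])] = kv_state[:i]

def run_request(cache, tokens):
    prefix, kv = cache.longest_cached_prefix(tokens)
    suffix = tokens[len(prefix):]                  # only this part needs prefill
    kv_state = kv + [f"kv({t})" for t in suffix]   # mock per-token KV entries
    cache.insert(tokens, kv_state)
    return len(suffix)                             # tokens actually computed

cache = PrefixCache()
shared = ["You", "are", "a", "helpful", "assistant", "."]
print(run_request(cache, shared + ["Question", "1"]))  # 8: nothing cached yet
print(run_request(cache, shared + ["Question", "2"]))  # 1: 7-token prefix reused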

Paper: https://arxiv.org/pdf/2312.07104v2.pdf

Code: https://github.com/sgl-project/sglang

Datasets: MMLU - HellaSwag - LLaVA-Bench

#DataScience #ArtificialIntelligence #MachineLearning #PythonProgramming #DeepLearning #LLM #AIResearch #BigData #NeuralNetworks #DataAnalytics #NLP #AutoML #DataVisualization #ScikitLearn #Pandas #NumPy #TensorFlow #AIethics #PredictiveModeling #GPUComputing #OpenSourceAI #DeepSeek

https://t.me/DataScienceT

Data Science | Machine Learning with Python for Researchers

11 Feb, 10:12


One Diffusion Step to Real-World Super-Resolution via Flow Trajectory Distillation

4 Feb 2025 · Jianze Li, JieZhang Cao, Yong Guo, Wenbo Li, Yulun Zhang ·

Diffusion models (DMs) have significantly advanced the development of real-world image super-resolution (Real-ISR), but the computational cost of multi-step diffusion models limits their application. One-step diffusion models generate high-quality images in a single sampling step, greatly reducing computational overhead and inference latency. However, most existing one-step diffusion methods are constrained by the performance of the teacher model, where poor teacher performance results in image artifacts. To address this limitation, we propose FluxSR, a novel one-step diffusion Real-ISR technique based on flow matching models. We use the state-of-the-art diffusion model FLUX.1-dev as both the teacher model and the base model. First, we introduce Flow Trajectory Distillation (FTD) to distill a multi-step flow matching model into a one-step Real-ISR. Second, to improve image realism and address high-frequency artifact issues in generated images, we propose TV-LPIPS as a perceptual loss and introduce Attention Diversification Loss (ADL) as a regularization term to reduce token similarity in the transformer, thereby eliminating high-frequency artifacts. Comprehensive experiments demonstrate that our method outperforms existing one-step diffusion-based Real-ISR methods.
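
As a rough illustration of a TV-plus-LPIPS style perceptual loss (the paper's exact TV-LPIPS formulation may differ; the third-party lpips package and the weighting below are assumptions), one could combine an LPIPS distance with an L1 penalty on image-gradient differences:

import torch
import lpips  # third-party package (pip install lpips); the VGG variant is assumed here

lpips_fn = lpips.LPIPS(net="vgg")  # expects images scaled to [-1, 1]

def image_gradients(x):
    # Horizontal and vertical finite differences (the high-frequency content TV looks at).
    dx = x[..., :, 1:] - x[..., :, :-1]
    dy = x[..., 1:, :] - x[..., :-1, :]
    return dx, dy

def tv_lpips_loss(sr, hr, tv_weight=0.1):
    dx_sr, dy_sr = image_gradients(sr)
    dx_hr, dy_hr = image_gradients(hr)
    tv_term = (dx_sr - dx_hr).abs().mean() + (dy_sr - dy_hr).abs().mean()
    perceptual = lpips_fn(sr, hr).mean()
    return perceptual + tv_weight * tv_term

sr = torch.rand(1, 3, 64, 64) * 2 - 1   # stand-in super-resolved output
hr = torch.rand(1, 3, 64, 64) * 2 - 1   # stand-in ground-truth image
print(tv_lpips_loss(sr, hr).item())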

Paper: https://arxiv.org/pdf/2502.01993v1.pdf

Code: https://github.com/jianzeli-114/fluxsr

#DataScience #ArtificialIntelligence #MachineLearning #PythonProgramming #DeepLearning #LLM #AIResearch #BigData #NeuralNetworks #DataAnalytics #NLP #AutoML #DataVisualization #ScikitLearn #Pandas #NumPy #TensorFlow #AIethics #PredictiveModeling #GPUComputing #OpenSourceAI #DeepSeek

https://t.me/DataScienceT

Data Science | Machine Learning with Python for Researchers

09 Feb, 07:26


Please add your friends and your teachers.

Data Science | Machine Learning with Python for Researchers

02 Feb, 18:43


IntellAgent: A Multi-Agent Framework for Evaluating Conversational AI Systems

Large Language Models (LLMs) are transforming artificial intelligence, evolving into task-oriented systems capable of autonomous planning and execution. One of the primary applications of LLMs is conversational AI systems, which must navigate multi-turn dialogues, integrate domain-specific APIs, and adhere to strict policy constraints. However, evaluating these agents remains a significant challenge, as traditional methods fail to capture the complexity and variability of real-world interactions. We introduce IntellAgent, a scalable, open-source multi-agent framework designed to evaluate conversational AI systems comprehensively. IntellAgent automates the creation of diverse, synthetic benchmarks by combining policy-driven graph modeling, realistic event generation, and interactive user-agent simulations. This innovative approach provides fine-grained diagnostics, addressing the limitations of static and manually curated benchmarks with coarse-grained metrics. IntellAgent represents a paradigm shift in evaluating conversational AI. By simulating realistic, multi-policy scenarios across varying levels of complexity, IntellAgent captures the nuanced interplay of agent capabilities and policy constraints. Unlike traditional methods, it employs a graph-based policy model to represent relationships, likelihoods, and complexities of policy interactions, enabling highly detailed diagnostics. IntellAgent also identifies critical performance gaps, offering actionable insights for targeted optimization. Its modular, open-source design supports seamless integration of new domains, policies, and APIs, fostering reproducibility and community collaboration. Our findings demonstrate that IntellAgent serves as an effective framework for advancing conversational AI by addressing challenges in bridging research and deployment.
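
A conceptual sketch of what a graph-based policy model might look like (node complexities, edge likelihoods, and the sampling routine below are illustrative assumptions, not IntellAgent's code):

import random
import networkx as nx

# Nodes are policies with a complexity score; edge weights are the likelihood
# that two policies co-occur in one conversation.
g = nx.Graph()
g.add_node("refund_policy", complexity=3)
g.add_node("id_verification", complexity=2)
g.add_node("escalation_rules", complexity=4)
g.add_edge("refund_policy", "id_verification", likelihood=0.8)
g.add_edge("refund_policy", "escalation_rules", likelihood=0.4)
g.add_edge("id_verification", "escalation_rules", likelihood=0.2)

def sample_scenario(graph, start, n_policies=2):
    # Walk the graph, preferring likely policy combinations, and score difficulty.
    policies = [start]
    current = start
    while len(policies) < n_policies:
        neighbours = [n for n in graph.neighbors(current) if n not in policies]
        if not neighbours:
            break
        weights = [graph[current][n]["likelihood"] for n in neighbours]
        current = random.choices(neighbours, weights=weights, k=1)[0]
        policies.append(current)
    difficulty = sum(graph.nodes[p]["complexity"] for p in policies)
    return policies, difficulty

print(sample_scenario(g, "refund_policy", n_policies=3))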

Paper: https://arxiv.org/pdf/2501.11067v1.pdf

Code: https://github.com/plurai-ai/intellagent

#DataScience #ArtificialIntelligence #MachineLearning #PythonProgramming #DeepLearning #LLM #AIResearch #BigData #NeuralNetworks #DataAnalytics #NLP #AutoML #DataVisualization #ScikitLearn #Pandas #NumPy #TensorFlow #AIethics #PredictiveModeling #GPUComputing #OpenSourceAI #DeepSeek

https://t.me/DataScienceT

Data Science | Machine Learning with Python for Researchers

02 Feb, 06:54


PaSa: An LLM Agent for Comprehensive Academic Paper Search

We introduce PaSa, an advanced Paper Search agent powered by large language models. PaSa can autonomously make a series of decisions, including invoking search tools, reading papers, and selecting relevant references, to ultimately obtain comprehensive and accurate results for complex scholarly queries. We optimize PaSa using reinforcement learning with a synthetic dataset, AutoScholarQuery, which includes 35k fine-grained academic queries and corresponding papers sourced from top-tier AI conference publications. Additionally, we develop RealScholarQuery, a benchmark collecting real-world academic queries to assess PaSa performance in more realistic scenarios. Despite being trained on synthetic data, PaSa significantly outperforms existing baselines on RealScholarQuery, including Google, Google Scholar, Google with GPT-4 for paraphrased queries, chatGPT (search-enabled GPT-4o), GPT-o1, and PaSa-GPT-4o (PaSa implemented by prompting GPT-4o). Notably, PaSa-7B surpasses the best Google-based baseline, Google with GPT-4o, by 37.78% in recall@20 and 39.90% in recall@50. It also exceeds PaSa-GPT-4o by 30.36% in recall and 4.25% in precision. Model, datasets, and code are available at https://github.com/bytedance/pasa.
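
The recall@k figures quoted above can be reproduced with a few lines; the toy retrieval results below are made up for illustration only:

# recall@k = fraction of relevant (ground-truth) papers that appear in the
# top-k retrieved results, averaged over queries.
def recall_at_k(retrieved, relevant, k):
    top_k = set(retrieved[:k])
    return len(top_k & set(relevant)) / len(relevant)

queries = [
    {"retrieved": ["p1", "p9", "p3", "p7"], "relevant": ["p1", "p3", "p5"]},
    {"retrieved": ["p2", "p4", "p8", "p6"], "relevant": ["p4"]},
]
for k in (2, 4):
    avg = sum(recall_at_k(q["retrieved"], q["relevant"], k) for q in queries) / len(queries)
    print(f"recall@{k} = {avg:.3f}")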

Paper: https://arxiv.org/pdf/2501.10120v1.pdf

Code: https://github.com/bytedance/pasa

#DataScience #ArtificialIntelligence #MachineLearning #PythonProgramming #DeepLearning #LLM #AIResearch #BigData #NeuralNetworks #DataAnalytics #NLP #AutoML #DataVisualization #ScikitLearn #Pandas #NumPy #TensorFlow #AIethics #PredictiveModeling #GPUComputing #OpenSourceAI #DeepSeek

https://t.me/DataScienceT

Data Science | Machine Learning with Python for Researchers

01 Feb, 07:48


DeepSeek-VL: Towards Real-World Vision-Language Understanding

We present DeepSeek-VL, an open-source Vision-Language (VL) Model designed for real-world vision and language understanding applications. Our approach is structured around three key dimensions: We strive to ensure our data is diverse, scalable, and extensively covers real-world scenarios including web screenshots, PDFs, OCR, charts, and knowledge-based content, aiming for a comprehensive representation of practical contexts. Further, we create a use case taxonomy from real user scenarios and construct an instruction tuning dataset accordingly. The fine-tuning with this dataset substantially improves the model's user experience in practical applications. Considering efficiency and the demands of most real-world scenarios, DeepSeek-VL incorporates a hybrid vision encoder that efficiently processes high-resolution images (1024 x 1024), while maintaining a relatively low computational overhead. This design choice ensures the model's ability to capture critical semantic and detailed information across various visual tasks. We posit that a proficient Vision-Language Model should, foremost, possess strong language abilities. To ensure the preservation of LLM capabilities during pretraining, we investigate an effective VL pretraining strategy by integrating LLM training from the beginning and carefully managing the competitive dynamics observed between vision and language modalities. The DeepSeek-VL family (both 1.3B and 7B models) showcases superior user experiences as a vision-language chatbot in real-world applications, achieving state-of-the-art or competitive performance across a wide range of visual-language benchmarks at the same model size while maintaining robust performance on language-centric benchmarks. We have made both 1.3B and 7B models publicly accessible to foster innovations based on this foundation model.

Paper: https://arxiv.org/pdf/2403.05525v2.pdf

Code: https://github.com/deepseek-ai/deepseek-vl

Datasets: MMLU - GSM8K - HellaSwag

#DataScience #ArtificialIntelligence #MachineLearning #PythonProgramming #DeepLearning #LLM #AIResearch #BigData #NeuralNetworks #DataAnalytics #NLP #AutoML #DataVisualization #ScikitLearn #Pandas #NumPy #TensorFlow #AIethics #PredictiveModeling #GPUComputing #OpenSourceAI #DeepSeek

https://t.me/DataScienceT

Data Science | Machine Learning with Python for Researchers

01 Feb, 07:25


DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

Paper: https://arxiv.org/pdf/2401.02954v1.pdf

Code: https://github.com/deepseek-ai/deepseek-llm

Dataset: AlignBench

#DataScience #ArtificialIntelligence #MachineLearning #PythonProgramming #DeepLearning #LLM #AIResearch #BigData #NeuralNetworks #DataAnalytics #NLP #AutoML #DataVisualization #ScikitLearn #Pandas #NumPy #TensorFlow #AIethics #PredictiveModeling #GPUComputing #OpenSourceAI #DeepSeek


https://t.me/DataScienceT

Data Science | Machine Learning with Python for Researchers

01 Feb, 07:23


Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling

Paper: https://arxiv.org/pdf/2501.17811v1.pdf

Code: https://github.com/deepseek-ai/janus

Datasets: ImageNet - GQA - MM-Vet

#DataScience #ArtificialIntelligence #MachineLearning #PythonProgramming #DeepLearning #AIResearch #BigData #NeuralNetworks #DataAnalytics #NLP #AutoML #DataVisualization #ScikitLearn #Pandas #NumPy #TensorFlow #AIethics #PredictiveModeling #GPUComputing #OpenSourceAI #DeepSeek


https://t.me/DataScienceT

Data Science | Machine Learning with Python for Researchers

01 Feb, 06:47


🔥🔥🔥 The SmolVLM developers have released open-source code for training SmolVLM from scratch on 256 H100 GPUs!

Inspired by DeepSeek R1, they have open-sourced the complete training code along with the model weights!

You can now train any of the SmolVLMs or create your own VLMs!

Starting training for SmolVLM 256M is very simple:
./vision/experiments/pretraining/vloom/tr_341_smolvlm_025b_1st_stage/01_launch.sh

Code: https://github.com/huggingface/smollm/tree/main/vision
SmolVLM: https://github.com/huggingface/smollm/tree/main

#SmolVLM #llm #opensource #ml #ai

Data Science | Machine Learning with Python for Researchers

31 Jan, 08:07


⭐️ Fast Think-on-Graph: Wider, Deeper and Faster Reasoning of Large Language Model on Knowledge Graph

🖥 Github: https://github.com/dosonleung/fasttog

📕 Paper: https://arxiv.org/abs/2501.14300v1

https://t.me/DataScienceT

Data Science | Machine Learning with Python for Researchers

30 Jan, 15:19


🐫 Tülu 3 (what a name) 405B - another release!

An open-source model (and no, it's not a Chinese model) that outperforms DeepSeek-V3 on multiple benchmarks.

Scaled to 405B parameters, with performance on par with GPT-4o and outperforming previous models in the same class.

Blog: https://allenai.org/blog/tulu-3-405B
You can test it here: https://playground.allenai.org/?model=tulu3-405b
Technical report: https://allenai.org/blog/tulu-3-technical
Hugging Face : https://huggingface.co/collections/allenai/tulu-3-models-673b8e0dc3512e30e7dc54f5

#llm #ml #ai #opensource

https://t.me/DataScienceT ❤️

Data Science | Machine Learning with Python for Researchers

30 Jan, 08:52


JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation

We present JanusFlow, a powerful framework that unifies image understanding and generation in a single model. JanusFlow introduces a minimalist architecture that integrates autoregressive language models with rectified flow, a state-of-the-art method in generative modeling. Our key finding demonstrates that rectified flow can be straightforwardly trained within the large language model framework, eliminating the need for complex architectural modifications. To further improve the performance of our unified model, we adopt two key strategies: (i) decoupling the understanding and generation encoders, and (ii) aligning their representations during unified training. Extensive experiments show that JanusFlow achieves comparable or superior performance to specialized models in their respective domains, while significantly outperforming existing unified approaches across standard benchmarks. This work represents a step toward more efficient and versatile vision-language models.
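
For intuition, here is a minimal rectified-flow training step in its generic form (a tiny MLP stands in for the model; this is not JanusFlow's architecture or training code):

import torch
import torch.nn as nn

# Rectified flow regresses the constant velocity (x1 - x0) along the straight
# path x_t = (1 - t) * x0 + t * x1 between noise x0 and data x1.
dim = 16
velocity_net = nn.Sequential(nn.Linear(dim + 1, 64), nn.SiLU(), nn.Linear(64, dim))
opt = torch.optim.Adam(velocity_net.parameters(), lr=1e-3)

def rectified_flow_step(x1):
    x0 = torch.randn_like(x1)                      # noise endpoint
    t = torch.rand(x1.shape[0], 1)                 # random time in (0, 1)
    xt = (1 - t) * x0 + t * x1                     # point on the straight path
    target_v = x1 - x0                             # constant velocity target
    pred_v = velocity_net(torch.cat([xt, t], dim=-1))
    loss = ((pred_v - target_v) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

batch = torch.randn(32, dim)  # stand-in "data" batch
print(rectified_flow_step(batch))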

Paper: https://arxiv.org/pdf/2411.07975v1.pdf

Code: https://github.com/deepseek-ai/janus

Datasets: GQA - MMBench - MM-Vet - SEED-Bench

https://t.me/DataScienceT 💚

Data Science | Machine Learning with Python for Researchers

29 Jan, 09:24


Is this the disruption we've been waiting for?

China's open-source model, DeepSeek, is outperforming ChatGPT & Claude in benchmarks—and it's 20-30x cheaper!

📅 Thursday, Jan 30 | 9 PM IST

Join Our FREE Workshop:-
1. Live comparison: DeepSeek vs. ChatGPT in reasoning, coding & math.
2. Cost-saving insights: Why DeepSeek is a game-changer.
3. Build your first DeepSeek-powered app, live!

Register now: https://lu.ma/ael5tq70?tk=23oh65

Join Our Telegram channel;
https://t.me/BuildFastWithAI

Data Science | Machine Learning with Python for Researchers

29 Jan, 07:18


ChatGPT Cheat Sheet for Business - DataCamp

Unlock the full potential of AI with our comprehensive ChatGPT Cheat Sheet for Business! Tailored specifically for professionals and entrepreneurs, this guide offers actionable insights on leveraging ChatGPT to streamline workflows, enhance customer interactions, and drive business growth. Whether you're a marketing specialist, project manager, or CEO, this cheat sheet is your go-to resource for mastering conversational AI.

From crafting compelling content to automating routine tasks, learn how to harness the power of ChatGPT in real-world business scenarios. With clear examples and step-by-step instructions, you’ll be able to integrate ChatGPT seamlessly into your operations, improving efficiency and innovation.

Don’t miss out on staying ahead of the competition by embracing the future of AI-driven solutions!

#ChatGPT #AIforBusiness #DataCamp #CheatSheet #ConversationalAI #BusinessGrowth #Automation #CustomerEngagement #ContentCreation #EfficiencyBoost #Innovation #FutureOfWork #TechTrends #AIInnovation #DigitalTransformation #BusinessSuccess

https://t.me/CodeProgrammer ⭐️

Data Science | Machine Learning with Python for Researchers

28 Jan, 12:01


LOOKING FOR A NEW SOURCE OF INCOME?
Average earnings of $100 a day

Lisa is looking for people who want to earn money. If you are responsible, motivated and want to change your life. Welcome to her channel.

WHAT YOU NEED TO WORK:
1. phone or computer
2. Free 15-20 minutes a day
3. desire to earn

❗️ Requires 20 people ❗️
Access is available at the link below
👇

https://t.me/+EWM2hR1d_As0ZDA5

Data Science | Machine Learning with Python for Researchers

28 Jan, 07:08


DeepSeek-V3 Technical Report

We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts Multi-head Latent Attention (MLA) and DeepSeekMoE architectures, which were thoroughly validated in DeepSeek-V2. Furthermore, DeepSeek-V3 pioneers an auxiliary-loss-free strategy for load balancing and sets a multi-token prediction training objective for stronger performance. We pre-train DeepSeek-V3 on 14.8 trillion diverse and high-quality tokens, followed by Supervised Fine-Tuning and Reinforcement Learning stages to fully harness its capabilities. Comprehensive evaluations reveal that DeepSeek-V3 outperforms other open-source models and achieves performance comparable to leading closed-source models. Despite its excellent performance, DeepSeek-V3 requires only 2.788M H800 GPU hours for its full training. In addition, its training process is remarkably stable. Throughout the entire training process, we did not experience any irrecoverable loss spikes or perform any rollbacks. The model checkpoints are available at https://github.com/deepseek-ai/DeepSeek-V3.

Paper: https://arxiv.org/pdf/2412.19437v1.pdf

Code: https://github.com/deepseek-ai/deepseek-v3

Datasets: MMLU - GSM8K

#DataScience #ArtificialIntelligence #MachineLearning #PythonProgramming #DeepLearning #AIResearch #BigData #NeuralNetworks #DataAnalytics #NLP #AutoML #DataVisualization #ScikitLearn #Pandas #NumPy #TensorFlow #AIethics #PredictiveModeling #GPUComputing #OpenSourceAI #DeepSeek

https://t.me/DataScienceT 😱

Data Science | Machine Learning with Python for Researchers

28 Jan, 07:05


DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper: https://arxiv.org/pdf/2501.12948v1.pdf

Codes:
https://github.com/zhaoolee/garss
https://github.com/deepseek-ai/deepseek-r1

Datasets: MMLU - IFEval - GPQA - MMLU-Pro

#DataScience #ArtificialIntelligence #MachineLearning #PythonProgramming #DeepLearning #AIResearch #BigData #NeuralNetworks #DataAnalytics #NLP #AutoML #DataVisualization #ScikitLearn #Pandas #NumPy #TensorFlow #AIethics #PredictiveModeling #GPUComputing #OpenSourceAI

https://t.me/DataScienceT ❤️

Data Science | Machine Learning with Python for Researchers

27 Jan, 17:30


⭐️ Fast Think-on-Graph: Wider, Deeper and Faster Reasoning of Large Language Model on Knowledge Graph

🖥 Github: https://github.com/dosonleung/fasttog

📕 Paper: https://arxiv.org/abs/2501.14300v1

#DataScience #ArtificialIntelligence #MachineLearning #PythonProgramming #DeepLearning #AIResearch #BigData #NeuralNetworks #DataAnalytics #NLP #AutoML #DataVisualization #ScikitLearn #Pandas #NumPy #TensorFlow #AIethics #PredictiveModeling #GPUComputing #OpenSourceAI

https://t.me/DataScienceT ❤️

Data Science | Machine Learning with Python for Researchers

27 Jan, 07:04


🚀 Boost Your IT Exam Prep with SPOTO's FREE Study Materials! 🎉

💡 Ready to Pass Your IT Exam?
SPOTO is here to help you succeed! Get SPOTO FREE IT study materials to jumpstart your certification journey. Whether you're preparing for #Cisco, #AWS, #PMP, #Python, #Excel, #Google, #Microsoft, or other certifications, we've got you covered.

🔗🎒Download Free IT Certs Exam E-book: https://bit.ly/4fJSoLP

🔗👩‍💻Test Your IT Skills for Free: https://bit.ly/3PoKH39

🔗📝Download Free Cloud Certs Study Materials: https://bit.ly/4gI4KWk

🔗📲Contact for 1v1 IT Certs Exam Help: https://wa.link/k0vy3x
🌐📚 JOIN IT Study GROUP👇: https://chat.whatsapp.com/E3Vkxa19HPO9ZVkWslBO8s

Data Science | Machine Learning with Python for Researchers

26 Jan, 15:51


Machine learning and deep learning
@Machine_learn

Large language Model Git

🔺https://t.me/deep_learning_proj

Data Science | Machine Learning with Python for Researchers

25 Jan, 07:27


Click-Calib: A Robust Extrinsic Calibration Method for Surround-View Systems

Surround-View System (SVS) is an essential component in Advanced Driver Assistance System (ADAS) and requires precise calibrations. However, conventional offline extrinsic calibration methods are cumbersome and time-consuming as they rely heavily on physical patterns. Additionally, these methods primarily focus on short-range areas surrounding the vehicle, resulting in lower calibration quality in more distant zones. To address these limitations, we propose Click-Calib, a pattern-free approach for offline SVS extrinsic calibration. Without requiring any special setup, the user only needs to click a few keypoints on the ground in natural scenes. Unlike other offline calibration approaches, Click-Calib optimizes camera poses over a wide range by minimizing reprojection distance errors of keypoints, thereby achieving accurate calibrations at both short and long distances. Furthermore, Click-Calib supports both single-frame and multiple-frame modes, with the latter offering even better results. Evaluations on our in-house dataset and the public WoodScape dataset demonstrate its superior accuracy and robustness compared to baseline methods.
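
A heavily simplified sketch of the underlying idea, assuming a planar rigid transform (yaw plus 2D translation) instead of full camera extrinsics: keypoints seen by two cameras are mapped to a common ground frame and the pose is optimised by minimising keypoint distance errors.

import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
pts_cam1 = rng.uniform(-5, 5, size=(8, 2))            # clicked keypoints, camera 1's ground frame

true_theta, true_t = 0.3, np.array([1.5, -0.7])        # unknown relative pose (ground truth here)
R_true = np.array([[np.cos(true_theta), -np.sin(true_theta)],
                   [np.sin(true_theta),  np.cos(true_theta)]])
pts_cam2 = (pts_cam1 - true_t) @ R_true                # same keypoints, camera 2's frame

def residuals(params):
    theta, tx, ty = params
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    reprojected = pts_cam2 @ R.T + np.array([tx, ty])  # map cam-2 points back to cam-1's frame
    return (reprojected - pts_cam1).ravel()            # keypoint distance errors to minimise

sol = least_squares(residuals, x0=[0.0, 0.0, 0.0])
print(sol.x)  # should recover approximately [0.3, 1.5, -0.7]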

Paper: https://arxiv.org/pdf/2501.01557v2.pdf

Code: https://github.com/lwangvaleo/click_calib

Dataset: WoodScape

#DataScience #ArtificialIntelligence #MachineLearning #PythonProgramming #DeepLearning #AIResearch #BigData #NeuralNetworks #DataAnalytics #NLP #AutoML #DataVisualization #ScikitLearn #Pandas #NumPy #TensorFlow #AIethics #PredictiveModeling #GPUComputing #OpenSourceAI

https://t.me/DataScienceT 👩‍💻

Data Science | Machine Learning with Python for Researchers

25 Jan, 06:32


Search-o1: Agentic Search-Enhanced Large Reasoning Models

Large reasoning models (LRMs) like OpenAI-o1 have demonstrated impressive long stepwise reasoning capabilities through large-scale reinforcement learning. However, their extended reasoning processes often suffer from knowledge insufficiency, leading to frequent uncertainties and potential errors. To address this limitation, we introduce \textbf{Search-o1}, a framework that enhances LRMs with an agentic retrieval-augmented generation (RAG) mechanism and a Reason-in-Documents module for refining retrieved documents. Search-o1 integrates an agentic search workflow into the reasoning process, enabling dynamic retrieval of external knowledge when LRMs encounter uncertain knowledge points. Additionally, due to the verbose nature of retrieved documents, we design a separate Reason-in-Documents module to deeply analyze the retrieved information before injecting it into the reasoning chain, minimizing noise and preserving coherent reasoning flow. Extensive experiments on complex reasoning tasks in science, mathematics, and coding, as well as six open-domain QA benchmarks, demonstrate the strong performance of Search-o1. This approach enhances the trustworthiness and applicability of LRMs in complex reasoning tasks, paving the way for more reliable and versatile intelligent systems.
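
A conceptual sketch of the loop described above (not the released code; llm and search below are hypothetical stand-ins for an LLM API and a retriever):

def llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM call here")

def search(query: str) -> list[str]:
    raise NotImplementedError("plug in your retriever here")

def search_o1(question: str, max_rounds: int = 3) -> str:
    reasoning = f"Question: {question}\nReasoning so far:\n"
    for _ in range(max_rounds):
        step = llm(reasoning + "\nIf knowledge is missing, reply SEARCH: <query>; "
                               "otherwise reply ANSWER: <final answer>.")
        if step.startswith("SEARCH:"):
            query = step[len("SEARCH:"):].strip()
            docs = search(query)
            # Reason-in-Documents: condense verbose documents before injection.
            condensed = llm("Extract only the facts relevant to the query "
                            f"'{query}' from these documents:\n" + "\n---\n".join(docs))
            reasoning += f"\n[Retrieved knowledge] {condensed}\n"
        else:
            return step.removeprefix("ANSWER:").strip()
    return llm(reasoning + "\nGive your best final answer.")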

paper: https://arxiv.org/pdf/2501.05366v1.pdf

Code: https://github.com/sunnynexus/search-o1

Datasets: Natural Questions - TriviaQA - MATH - HotpotQA - GPQA - Bamboogle

#Search_o1 #LargeReasoningModels #AgenticRAG #ReasonInDocuments #DynamicKnowledgeRetrieval #ComplexReasoning #ScienceMathCoding #OpenDomainQA #TrustworthyAI #IntelligentSystems #python

https://t.me/DataScienceT 😱

Data Science | Machine Learning with Python for Researchers

23 Jan, 08:52


Align Anything: Training All-Modality Models to Follow Instructions with Language Feedback

Paper: https://arxiv.org/pdf/2412.15838v2.pdf

Code: https://github.com/pku-alignment/align-anything

Dataset: LLaVA-Bench

https://t.me/DataScienceT 😱

Data Science | Machine Learning with Python for Researchers

23 Jan, 08:50


Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Diffusion Transformer Networks

paper: https://arxiv.org/pdf/2412.00733v3.pdf

Code: https://github.com/fudan-generative-vision/hallo3

https://t.me/DataScienceT 😮

Data Science | Machine Learning with Python for Researchers

23 Jan, 08:46


Transformer²: Self-adaptive LLMs

Paper: https://arxiv.org/pdf/2501.06252v2.pdf

Code:
https://github.com/SakanaAI/self-adaptive-llms
https://github.com/codelion/adaptive-classifier

Datasets: GSM8K - HumanEval - MATH - MBPP - TextVQA - OK-VQA - ARC (AI2 Reasoning Challenge)

https://t.me/DataScienceT ❤️

Data Science | Machine Learning with Python for Researchers

22 Jan, 12:40


🎁 Your balance has been credited with $4,000, and the owner of the channel wants to contact you!

Dear subscriber, thank you very much for supporting our channel. As a token of our gratitude, we would like to give you free access to Lisa's investor channel, which you can use to start earning today.

T.me/Lisainvestor

Be sure to take advantage of our gift, admission is free, don't miss the opportunity, change your life for the better.

You can follow the link :
https://t.me/+-FM_9cBcSGUyZmFh

Data Science | Machine Learning with Python for Researchers

22 Jan, 07:36


Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise

Generative modeling aims to transform random noise into structured outputs. In this work, we enhance video diffusion models by allowing motion control via structured latent noise sampling. This is achieved by just a change in data: we pre-process training videos to yield structured noise. Consequently, our method is agnostic to diffusion model design, requiring no changes to model architectures or training pipelines. Specifically, we propose a novel noise warping algorithm, fast enough to run in real time, that replaces random temporal Gaussianity with correlated warped noise derived from optical flow fields, while preserving the spatial Gaussianity. The efficiency of our algorithm enables us to fine-tune modern video diffusion base models using warped noise with minimal overhead, and provide a one-stop solution for a wide range of user-friendly motion control: local object motion control, global camera movement control, and motion transfer. The harmonization between temporal coherence and spatial Gaussianity in our warped noise leads to effective motion control while maintaining per-frame pixel quality. Extensive experiments and user studies demonstrate the advantages of our method, making it a robust and scalable approach for controlling motion in video diffusion models.
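
A minimal sketch of flow-warped noise (the paper's real-time algorithm is more involved and is designed to preserve spatial Gaussianity; the simple bilinear warp below is only an illustration):

import torch
import torch.nn.functional as F

def warp_noise(prev_noise, flow):
    # prev_noise: (N, C, H, W) Gaussian noise; flow: (N, 2, H, W) displacement in pixels.
    n, _, h, w = prev_noise.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), dim=-1).float().unsqueeze(0).expand(n, -1, -1, -1)
    coords = base + flow.permute(0, 2, 3, 1)          # where each output pixel samples from
    # Normalise pixel coordinates to [-1, 1] as grid_sample expects.
    coords[..., 0] = 2 * coords[..., 0] / (w - 1) - 1
    coords[..., 1] = 2 * coords[..., 1] / (h - 1) - 1
    return F.grid_sample(prev_noise, coords, mode="bilinear",
                         padding_mode="border", align_corners=True)

noise_t0 = torch.randn(1, 4, 64, 64)                  # latent-sized noise for frame 0
flow = torch.zeros(1, 2, 64, 64); flow[:, 0] = 3.0    # e.g. 3-pixel horizontal motion
noise_t1 = warp_noise(noise_t0, flow)                 # temporally correlated noise for frame 1
print(noise_t1.shape)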

Paper: https://arxiv.org/pdf/2501.08331v2.pdf

Code:
https://github.com/gowiththeflowpaper/gowiththeflowpaper.github.io
https://github.com/vgenai-netflix-eyeline-research/go-with-the-flow

https://t.me/DataScienceT 🌟

Data Science | Machine Learning with Python for Researchers

22 Jan, 06:05


MiniCPM-V: A GPT-4V Level MLLM on Your Phone

The recent surge of Multimodal Large Language Models (MLLMs) has fundamentally reshaped the landscape of #AI research and industry, shedding light on a promising path toward the next AI milestone. However, significant challenges remain preventing MLLMs from being practical in real-world applications. The most notable challenge comes from the huge cost of running an MLLM with a massive number of parameters and extensive computation. As a result, most MLLMs need to be deployed on high-performing cloud servers, which greatly limits their application scopes such as mobile, offline, energy-sensitive, and privacy-protective scenarios. In this work, we present MiniCPM-V, a series of efficient #MLLMs deployable on end-side devices. By integrating the latest MLLM techniques in architecture, pretraining and alignment, the latest MiniCPM-Llama3-V 2.5 has several notable features: (1) Strong performance, outperforming GPT-4V-1106, Gemini Pro and Claude 3 on OpenCompass, a comprehensive evaluation over 11 popular benchmarks, (2) strong #OCR capability and 1.8M pixel high-resolution #image perception at any aspect ratio, (3) trustworthy behavior with low hallucination rates, (4) multilingual support for 30+ languages, and (5) efficient deployment on mobile phones. More importantly, MiniCPM-V can be viewed as a representative example of a promising trend: The model sizes for achieving usable (e.g., GPT-4V) level performance are rapidly decreasing, along with the fast growth of end-side computation capacity. This jointly shows that GPT-4V level MLLMs deployed on end devices are becoming increasingly possible, unlocking a wider spectrum of real-world AI applications in the near future.

Paper: https://arxiv.org/pdf/2408.01800v1.pdf

Codes:
https://github.com/OpenBMB/MiniCPM-o
https://github.com/openbmb/minicpm-v

Datasets: Video-MME

#MachineLearning #DeepLearning #BigData #Datascience #ML #HealthTech #DataVisualization #ArtificialInteligence #SoftwareEngineering #GenAI #deeplearning #ChatGPT #OpenAI #python #AI #keras #SQL #Statistics

https://t.me/DataScienceT ❤️

Data Science | Machine Learning with Python for Researchers

21 Jan, 14:26


🎁 Your balance has been credited with $4,000, and the owner of the channel wants to contact you!

Dear subscriber, thank you very much for supporting our channel. As a token of our gratitude, we would like to give you free access to Lisa's investor channel, which you can use to start earning today.

T.me/Lisainvestor

Be sure to take advantage of our gift, admission is free, don't miss the opportunity, change your life for the better.

You can follow the link :
https://t.me/+-FM_9cBcSGUyZmFh

Data Science | Machine Learning with Python for Researchers

20 Jan, 19:45


DeepSeek-V3 Technical Report

We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts Multi-head Latent Attention (MLA) and DeepSeekMoE architectures, which were thoroughly validated in #DeepSeek V2. Furthermore, DeepSeek-V3 pioneers an auxiliary-loss-free strategy for load balancing and sets a multi-token prediction training objective for stronger performance. We pre-train DeepSeek-V3 on 14.8 trillion diverse and high-quality tokens, followed by Supervised Fine-Tuning and Reinforcement Learning stages to fully harness its capabilities. Comprehensive evaluations reveal that DeepSeek-V3 outperforms other open-source models and achieves performance comparable to leading closed-source models. Despite its excellent performance, DeepSeek-V3 requires only 2.788M H800 GPU hours for its full training. In addition, its training process is remarkably stable. Throughout the entire training process, we did not experience any irrecoverable loss spikes or perform any rollbacks. The model checkpoints are available at https://github.com/deepseek-ai/DeepSeek-V3.

Paper: https://arxiv.org/pdf/2412.19437v1.pdf

Code: https://github.com/deepseek-ai/deepseek-v3

#aiagents #ai #llm #ml #machinelearning #python

https://t.me/DataScienceT 💚

Data Science | Machine Learning with Python for Researchers

20 Jan, 19:43


3DGS-to-PC: Convert a 3D Gaussian Splatting Scene into a Dense Point Cloud or Mesh

3D Gaussian Splatting (3DGS) excels at producing highly detailed 3D reconstructions, but these scenes often require specialised renderers for effective visualisation. In contrast, point clouds are a widely used 3D representation and are compatible with most popular 3D processing software, yet converting 3DGS scenes into point clouds is a complex challenge. In this work we introduce 3DGS-to-PC, a flexible and highly customisable framework that is capable of transforming 3DGS scenes into dense, high-accuracy point clouds. We sample points probabilistically from each Gaussian as a 3D density function. We additionally threshold new points using the Mahalanobis distance to the Gaussian centre, preventing extreme outliers. The result is a point cloud that closely represents the shape encoded into the 3D Gaussian scene. Individual Gaussians use spherical harmonics to adapt colours depending on view, and each point may contribute only subtle colour hints to the resulting rendered scene. To avoid spurious or incorrect colours that do not fit with the final point cloud, we recalculate Gaussian colours via a customised image rendering approach, assigning each Gaussian the colour of the pixel to which it contributes most across all views. 3DGS-to-PC also supports mesh generation through Poisson Surface Reconstruction, applied to points sampled from predicted surface Gaussians. This allows coloured meshes to be generated from 3DGS scenes without the need for re-training. This package is highly customisable and capable of simple integration into existing 3DGS pipelines. 3DGS-to-PC provides a powerful tool for converting 3DGS data into point cloud and surface-based formats.
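
The core sampling step can be illustrated as follows (a sketch, not the 3DGS-to-PC code): draw points from a Gaussian treated as a density and reject samples whose Mahalanobis distance to the centre exceeds a threshold.

import numpy as np

def sample_gaussian_points(mean, cov, n_samples=1000, max_mahalanobis=2.0):
    # Sample from the Gaussian as a 3D density, then drop extreme outliers
    # using the Mahalanobis distance to the Gaussian centre.
    pts = np.random.multivariate_normal(mean, cov, size=n_samples)
    cov_inv = np.linalg.inv(cov)
    diff = pts - mean
    m_dist = np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))
    return pts[m_dist <= max_mahalanobis]

mean = np.array([0.0, 0.0, 0.0])
cov = np.diag([0.04, 0.01, 0.09])      # anisotropic Gaussian "splat"
points = sample_gaussian_points(mean, cov)
print(points.shape)                    # roughly 74% of samples survive the 2-sigma cut in 3D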

Paper: https://arxiv.org/pdf/2501.07478v1.pdf

Code: https://github.com/lewis-stuart-11/3dgs-to-pc

Dataset: NeRF

https://t.me/DataScienceT 💚

Data Science | Machine Learning with Python for Researchers

20 Jan, 19:39


Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

This work presents Sa2VA, the first unified model for dense grounded understanding of both images and videos. Unlike existing multi-modal large language models, which are often limited to specific modalities and tasks, Sa2VA supports a wide range of image and video tasks, including referring segmentation and conversation, with minimal one-shot instruction tuning. Sa2VA combines SAM-2, a foundation video segmentation model, with LLaVA, an advanced vision-language model, and unifies text, image, and video into a shared LLM token space. Using the LLM, Sa2VA generates instruction tokens that guide SAM-2 in producing precise masks, enabling a grounded, multi-modal understanding of both static and dynamic visual content. Additionally, we introduce Ref-SAV, an auto-labeled dataset containing over 72k object expressions in complex video scenes, designed to boost model performance. We also manually validate 2k video objects in the Ref-SAV datasets to benchmark referring video object segmentation in complex environments. Experiments show that Sa2VA achieves state-of-the-art across multiple tasks, particularly in referring video object segmentation, highlighting its potential for complex real-world applications.

Paper: https://arxiv.org/pdf/2501.04001v1.pdf

Code: https://github.com/magic-research/Sa2VA

Dataset: Visual Question Answering (VQA)

https://t.me/DataScienceT ❤️

Data Science | Machine Learning with Python for Researchers

20 Jan, 19:17


Unlock a treasure trove of knowledge with our exclusive paid channel! For just $2 a month, gain access to thousands of valuable resources, including essential books and premium courses from Coursera and Udemy. Plus, dive into exciting paid projects! Enjoy hassle-free automatic payments via Telegram. Join us today! Link

Data Science | Machine Learning with Python for Researchers

20 Jan, 07:33


Evolutionary Computation in the Era of Large Language Model: Survey and Roadmap

Large language models (LLMs) have not only revolutionized natural language processing but also extended their prowess to various domains, marking a significant stride towards artificial general intelligence. The interplay between LLMs and evolutionary algorithms (EAs), despite differing in objectives and methodologies, shares a common pursuit of applicability to complex problems. Meanwhile, EAs can provide an optimization framework for further enhancing LLMs under black-box settings, empowering LLMs with flexible global search capacities. On the other hand, the abundant domain knowledge inherent in LLMs could enable EAs to conduct more intelligent searches. Furthermore, the text processing and generative capabilities of LLMs would aid in deploying EAs across a wide range of tasks. Based on these complementary advantages, this paper provides a thorough review and a forward-looking roadmap, categorizing the reciprocal inspiration into two main avenues: LLM-enhanced EA and EA-enhanced #LLM. Some integrated synergy methods are further introduced to exemplify the complementarity between LLMs and EAs in diverse scenarios, including code generation, software engineering, neural architecture search, and various generation tasks. As the first comprehensive review focused on the EA research in the era of #LLMs, this paper provides a foundational stepping stone for understanding the collaborative potential of LLMs and EAs. The identified challenges and future directions offer guidance for researchers and practitioners to unlock the full potential of this innovative collaboration in propelling advancements in optimization and artificial intelligence.

Paper: https://arxiv.org/pdf/2401.10034v3.pdf

Code: https://github.com/wuxingyu-ai/llm4ec

https://t.me/DataScienceT ⭐️

Data Science | Machine Learning with Python for Researchers

20 Jan, 07:31


Cold-Start Recommendation towards the Era of Large Language Models (LLMs): A Comprehensive Survey and Roadmap

Cold-start problem is one of the long-standing challenges in recommender systems, focusing on accurately modeling new or interaction-limited users or items to provide better recommendations. Due to the diversification of internet platforms and the exponential growth of users and items, the importance of cold-start recommendation (CSR) is becoming increasingly evident. At the same time, large language models (LLMs) have achieved tremendous success and possess strong capabilities in modeling user and item information, providing new potential for cold-start recommendations. However, the research community on CSR still lacks a comprehensive review and reflection in this field. Based on this, in this paper, we stand in the context of the era of large language models and provide a comprehensive review and discussion on the roadmap, related literature, and future directions of CSR. Specifically, we have conducted an exploration of the development path of how existing CSR utilizes information, from content features, graph relations, and domain information, to the world knowledge possessed by large language models, aiming to provide new insights for both the research and industrial communities on CSR. Related resources of cold-start recommendations are collected and continuously updated for the community in https://github.com/YuanchenBei/Awesome-Cold-Start-Recommendation.

Paper: https://arxiv.org/pdf/2501.01945v2.pdf

Code: https://github.com/yuanchenbei/awesome-cold-start-recommendation

https://t.me/DataScienceT 🩷

Data Science | Machine Learning with Python for Researchers

19 Jan, 07:29


https://t.me/datasets1

Find your datasets here.

Data Science | Machine Learning with Python for Researchers

19 Jan, 07:19


The GAN is dead; long live the GAN! A Modern GAN Baseline

There is a widely-spread claim that GANs are difficult to train, and GAN architectures in the literature are littered with empirical tricks. We provide evidence against this claim and build a modern GAN baseline in a more principled manner. First, we derive a well-behaved regularized relativistic GAN loss that addresses issues of mode dropping and non-convergence that were previously tackled via a bag of ad-hoc tricks. We analyze our loss mathematically and prove that it admits local convergence guarantees, unlike most existing relativistic losses. Second, our new loss allows us to discard all ad-hoc tricks and replace outdated backbones used in common GANs with modern architectures. Using StyleGAN2 as an example, we present a roadmap of simplification and modernization that results in a new minimalist baseline -- R3GAN. Despite being simple, our approach surpasses StyleGAN2 on FFHQ, ImageNet, CIFAR, and Stacked MNIST datasets, and compares favorably against state-of-the-art GANs and diffusion models.
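
A sketch of the relativistic pairwise GAN loss the baseline builds on (the paper's regularized objective also includes gradient-penalty terms, omitted here; the batch values below are random placeholders):

import torch
import torch.nn.functional as F

def rpgan_d_loss(d_real, d_fake):
    # Discriminator wants real logits to exceed the paired fake logits.
    return F.softplus(-(d_real - d_fake)).mean()

def rpgan_g_loss(d_real, d_fake):
    # Generator wants the opposite ordering.
    return F.softplus(-(d_fake - d_real)).mean()

d_real = torch.randn(8)   # critic outputs on a batch of real images
d_fake = torch.randn(8)   # critic outputs on the paired generated images
print(rpgan_d_loss(d_real, d_fake).item(), rpgan_g_loss(d_real, d_fake).item())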

Paper: https://arxiv.org/pdf/2501.05441v1.pdf

Code: https://github.com/brownvc/r3gan

Dataset: CIFAR-10

https://t.me/DataScienceT 😵‍💫

Data Science | Machine Learning with Python for Researchers

19 Jan, 07:05


UnCommon Objects in 3D

We introduce Uncommon Objects in 3D (uCO3D), a new object-centric dataset for 3D deep learning and 3D generative AI. uCO3D is the largest publicly-available collection of high-resolution videos of objects with 3D annotations that ensures full-360 coverage. uCO3D is significantly more diverse than MVImgNet and CO3Dv2, covering more than 1,000 object categories. It is also of higher quality, due to extensive quality checks of both the collected videos and the 3D annotations. Similar to analogous datasets, uCO3D contains annotations for 3D camera poses, depth maps and sparse point clouds. In addition, each object is equipped with a caption and a 3D Gaussian Splat reconstruction. We train several large 3D models on MVImgNet, CO3Dv2, and uCO3D and obtain superior results using the latter, showing that uCO3D is better for learning applications.

Paper: https://arxiv.org/pdf/2501.07574v1.pdf

Code: https://github.com/facebookresearch/uco3d

DataSet: MS COCO

https://t.me/DataScienceT 🐻‍❄️

Data Science | Machine Learning with Python for Researchers

19 Jan, 07:02


Tensor Product Attention Is All You Need

Paper: https://arxiv.org/pdf/2501.06425v1.pdf

Code: https://github.com/tensorgi/t6

Dataset: MMLU

https://t.me/DataScienceT

Data Science | Machine Learning with Python for Researchers

19 Jan, 03:03


Continual Forgetting for Pre-trained Vision Models (CVPR2024)

🖥 Github: https://github.com/bjzhb666/GS-LoRA

📕 Paper: https://arxiv.org/abs/2501.09705v1

🧠 Dataset: https://paperswithcode.com/dataset/coco

https://t.me/DataScienceT 🧠

Data Science | Machine Learning with Python for Researchers

18 Jan, 07:13


MiniRAG: Towards Extremely Simple Retrieval-Augmented Generation

Paper: https://arxiv.org/pdf/2501.06713v2.pdf

Code: https://github.com/hkuds/minirag

https://t.me/DataScienceT 🧠

Data Science | Machine Learning with Python for Researchers

18 Jan, 07:01


Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget

Paper: https://arxiv.org/pdf/2407.15811v1.pdf

code: https://github.com/sonyresearch/micro_diffusion

Datasets: MS COCO

https://t.me/DataScienceT 🧠

Data Science | Machine Learning with Python for Researchers

18 Jan, 06:53


FramePainter: Endowing Interactive Image Editing with Video Diffusion Priors

Paper: https://arxiv.org/pdf/2501.08225v1.pdf

Code: https://github.com/ybybzhang/framepainter

https://t.me/DataScienceT ✈️

Data Science | Machine Learning with Python for Researchers

15 Jan, 18:56


Parameter-Inverted Image Pyramid Networks for Visual Perception and Multimodal Understanding

🖥 Github: https://github.com/opengvlab/piip

📕 Paper: https://arxiv.org/abs/2501.07783v1

⭐️ Dataset: https://paperswithcode.com/dataset/gqa

https://t.me/DataScienceT 🧠

Data Science | Machine Learning with Python for Researchers

14 Jan, 15:13


AXIAL: Attention-based eXplainability for Interpretable Alzheimer's Localized Diagnosis using 2D CNNs on 3D MRI brain scans

Paper: https://arxiv.org/pdf/2407.02418v2.pdf

Code: https://github.com/GabrieleLozupone/AXIAL

Dataset: ADNI

https://t.me/DataScienceT 🧠

Data Science | Machine Learning with Python for Researchers

14 Jan, 15:08


SHMT: Self-supervised Hierarchical Makeup Transfer via Latent Diffusion Models

Paper: https://arxiv.org/pdf/2412.11058v1.pdf

Code:
https://github.com/snowfallingplum/shmt
https://github.com/snowfallingplum/csd-mt

https://t.me/DataScienceT 🐍#️⃣

Data Science | Machine Learning with Python for Researchers

12 Jan, 06:54


Embark on a new beginning by joining our WhatsApp channel.

https://whatsapp.com/channel/0029VaC7Weq29753hpcggW2A/110

Data Science | Machine Learning with Python for Researchers

10 Jan, 15:07


Please leave 2️⃣9️⃣ likes or ⭐️ reactions.

Data Science | Machine Learning with Python for Researchers

10 Jan, 14:30


💻 ACU - Awesome Agents for Computer Use

A project that contains a carefully selected list of resources about AI agents designed to run autonomously on your computers.

It includes research studies, projects, frameworks, guides and various tools.

The agents support task analysis and decision-making for interacting with any interface.

▪️ Github

#aiagents #awesome #agents

https://t.me/DataScienceT

Data Science | Machine Learning with Python for Researchers

09 Jan, 19:06


Are They the Same? Exploring Visual Correspondence Shortcomings of Multimodal LLMs

🖥 Github: https://github.com/zhouyiks/CoLVA/tree/main

📕 Paper: https://arxiv.org/pdf/2501.04670v1.pdf

⭐️ Dataset: https://paperswithcode.com/dataset/bdd100k

https://t.me/DataScienceT ✉️

Data Science | Machine Learning with Python for Researchers

07 Jan, 12:35


LOOKING FOR A NEW SOURCE OF INCOME?
Average earnings of $100 a day

Lisa is looking for people who want to earn money. If you are responsible, motivated and want to change your life. Welcome to her channel.

WHAT YOU NEED TO WORK:
1. phone or computer
2. Free 15-20 minutes a day
3. desire to earn

❗️ Requires 20 people ❗️
Access is available at the link below
👇

https://t.me/+aZQRLmmFFbw1NzIx

Data Science | Machine Learning with Python for Researchers

06 Jan, 16:17


This channel is for programmers, coders, and software engineers.

0️⃣ Python
1️⃣ Data Science
2️⃣ Machine Learning
3️⃣ Data Visualization
4️⃣ Artificial Intelligence
5️⃣ Data Analysis
6️⃣ Statistics
7️⃣ Deep Learning
8️⃣ Programming Languages

https://t.me/addlist/8_rRW2scgfRhOTc0

https://t.me/Python53

Data Science | Machine Learning with Python for Researchers

31 Dec, 17:11


DID YOU SEE MR. BEAST’S NEW VIDEO?

Ronaldo shared info about a Telegram channel where you can earn money playing Aviator.

Here’s how:

- Subscribe to the channel

- Register through a unique 1win link
😍

- Deposit 1000+ RS to join Amir Khan’s VIP group
🤑

📱Don't miss the chance to change your life in 2025📱
https://t.me/+rXjMg8HEZrwyZGZh
https://t.me/+rXjMg8HEZrwyZGZh
https://t.me/+rXjMg8HEZrwyZGZh

Data Science | Machine Learning with Python for Researchers

30 Dec, 13:55


Automating the Search for Artificial Life with Foundation Models

paper: https://arxiv.org/pdf/2412.17799v1.pdf

Code: https://github.com/sakanaai/asal

https://t.me/DataScienceT 💙

Data Science | Machine Learning with Python for Researchers

30 Dec, 13:49


CogAgent: A Visual Language Model for GUI Agents

Paper: https://arxiv.org/pdf/2312.08914v3.pdf

CVPR 2024: http://openaccess.thecvf.com//content/CVPR2024/papers/Hong_CogAgent_A_Visual_Language_Model_for_GUI_Agents_CVPR_2024_paper.pdf

Code1: https://github.com/thudm/cogvlm
Code2: https://github.com/digirl-agent/digirl
Code3: https://github.com/THUDM/CogAgent

Dataset: TextVQA

https://t.me/DataScienceT 🩵

Data Science | Machine Learning with Python for Researchers

30 Dec, 13:47


KAG: Boosting LLMs in Professional Domains via Knowledge Augmented Generation

Paper: https://arxiv.org/pdf/2409.13731v3.pdf

Code: https://github.com/openspg/kag

Dataset: 2WikiMultiHopQA

https://t.me/DataScienceT 💙

Data Science | Machine Learning with Python for Researchers

30 Dec, 12:06


LOOKING FOR A NEW SOURCE OF INCOME?
Average earnings of $100 a day

Lisa is looking for people who want to earn money. If you are responsible, motivated and want to change your life. Welcome to her channel.

WHAT YOU NEED TO WORK:
1. phone or computer
2. Free 15-20 minutes a day
3. desire to earn

❗️ Requires 20 people ❗️
Access is available at the link below
👇

https://t.me/+FcwoGw3QeO40NmIx

Data Science | Machine Learning with Python for Researchers

29 Dec, 10:51


Large Language Models Course: Learn by Doing LLM Projects

🖥 Github: https://github.com/peremartra/Large-Language-Model-Notebooks-Course

📕 Paper: https://doi.org/10.31219/osf.io/qgxea

https://t.me/DataScienceT

Data Science | Machine Learning with Python for Researchers

29 Dec, 08:25


Please run this bot.

And if anyone has Telegram Premium, please run this bot:
https://t.me/rating/app?startapp=ref_dc9b78418788114

Data Science | Machine Learning with Python for Researchers

27 Dec, 12:04


💰FOCUS ON MONEY 💰

Get one effective signal for FREE after depositing by following these 3 simple steps:

- Make a deposit using my link💖
- Get a free working signal📈
- Start earning now💸

👇Subscribe and let's make some money, bhai👇
https://t.me/+rXjMg8HEZrwyZGZh
https://t.me/+rXjMg8HEZrwyZGZh
https://t.me/+rXjMg8HEZrwyZGZh

Data Science | Machine Learning with Python for Researchers

22 Dec, 14:36


Join our paid channel on Telegram!

The channel includes a huge encyclopedia of important books in the fields of data science, artificial intelligence, and data analysis, as well as a large collection of high-rated and high-priced courses.

Choose your plan:

💎 Monthly subscription ($2 - automatic joining after payment is completed)
https://t.me/+r_Tcx2c-oVU1OWNi

💎 Annual subscription ($12)

💎 Lifetime subscription ($17)

For annual and lifetime subscriptions, please contact me at @HusseinSheikho

Data Science | Machine Learning with Python for Researchers

18 Dec, 21:05


🀄 GuoFeng Webnovel: A Discourse-Level and Multilingual Corpus of Web Fiction

🖥 Github: https://github.com/longyuewangdcu/guofeng-webnovel

📕 Paper: https://arxiv.org/abs/2412.11732v1

🌟 Dataset: www2.statmt.org/wmt24/literary-trans

https://t.me/DataScienceT 🏳

Data Science | Machine Learning with Python for Researchers

17 Dec, 12:12


🤑EARN YOUR $100 TODAY! EASY!

Lisa Trader has launched a free marathon on her VIP channel.

Now absolutely everyone can earn from trading. It has become even easier to earn in the cryptocurrency market; you can start today!

WHAT DO YOU NEED TO START?

1. Subscribe to the channel SIGNALS BY LISA TRADER 📈.
2. Write “MARATHON” in private messages. She will then tell you how to get on the vip channel for absolutely FREE!

👉CLICK HERE👈
👉CLICK HERE👈
👉CLICK HERE👈

Data Science | Machine Learning with Python for Researchers

17 Dec, 03:23


⚡️ Byte Latent Transformer: Patches Scale Better Than Tokens

The Byte Latent Transformer (BLT) is a new byte-level LLM architecture that, for the first time, matches tokenization-based LLM performance at scale, with significant improvements in inference efficiency and robustness.

🖥 Github: https://github.com/facebookresearch/blt

📕 Paper: https://arxiv.org/abs/2412.09871v1

🌟 Dataset: https://paperswithcode.com/dataset/mmlu

https://t.me/DataScienceT

Data Science | Machine Learning with Python for Researchers

16 Dec, 06:18


OASIS Alzheimer's Detection

Large-scale brain MRI dataset for deep neural network analysis

About Dataset
The dataset used is the OASIS MRI dataset (https://sites.wustl.edu/oasisbrains/), which consists of 80,000 brain MRI images. The images have been divided into four classes based on Alzheimer's progression. The dataset aims to provide a valuable resource for analyzing and detecting early signs of Alzheimer's disease.

To make the dataset accessible, the original .img and .hdr files were converted into Nifti format (.nii) using FSL (FMRIB Software Library). The converted MRI images of 461 patients have been uploaded to a GitHub repository, which can be accessed in multiple parts.
For the neural network training, 2D images were used as input. The brain images were sliced along the z-axis into 256 pieces, and slices ranging from 100 to 160 were selected from each patient. This approach resulted in a comprehensive dataset for analysis.

Patient classification was performed based on the provided metadata and Clinical Dementia Rating (CDR) values, resulting in four classes: demented, very mild demented, mild demented, and non-demented. These classes enable the detection and study of different stages of Alzheimer's disease progression.

During the dataset preparation, the .nii MRI scans were converted to .jpg files. Although this conversion presented some challenges, the files were successfully processed using appropriate tools. The resulting dataset size is 1.3 GB.
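
A rough sketch of that preprocessing, assuming nibabel and Pillow are installed and using placeholder file names; it loads a .nii volume, takes axial slices 100-160 along the (assumed third) z-axis, and saves them as 8-bit .jpg files:

import os
import nibabel as nib
import numpy as np
from PIL import Image

def export_slices(nii_path, out_dir, z_range=(100, 160)):
    os.makedirs(out_dir, exist_ok=True)
    volume = np.squeeze(nib.load(nii_path).get_fdata())  # drop singleton dims if present
    for z in range(*z_range):
        sl = volume[:, :, z]                              # assumes the third axis is z
        # Normalise each slice to 0-255 before saving as JPEG.
        sl = (255 * (sl - sl.min()) / (sl.max() - sl.min() + 1e-8)).astype(np.uint8)
        Image.fromarray(sl).save(os.path.join(out_dir, f"slice_{z:03d}.jpg"))

export_slices("OAS1_0001_MR1.nii", "slices/OAS1_0001_MR1")  # placeholder paths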

https://t.me/datasets1 🌟

Data Science | Machine Learning with Python for Researchers

15 Dec, 11:36


2DMatGMM: An open-source robust machine learning platform for real-time detection and classification of 2D material flakes

🖥 Github: https://github.com/jaluus/2dmatgmm

📕 Paper: https://arxiv.org/abs/2412.09333v1

⭐️ Dataset: https://paperswithcode.com/task/instance-segmentation

https://t.me/DataScienceT 🏳

Data Science | Machine Learning with Python for Researchers

11 Dec, 18:26


🌟 BioNeMo: A Framework for Developing AI Models for Drug Design.

NVIDIA BioNeMo2 Framework is a set of tools, libraries, and models for computational drug discovery and design.

It accelerates the most time-consuming and expensive steps in building and adapting biomolecular AI models by providing optimized models and tools that are easily integrated into GPU-based computing resources.

The framework enables the creation, training and tuning of models, and its capabilities span a variety of workloads and therapeutic mechanisms: molecule generation, protein structure prediction, protein-ligand prediction and representation learning.

In addition to pipeline code, scripts and utilities, BioNeMo2 Framework contains:

▶️ Pre-trained models:

🟢 ESM-2 is a pre-trained bidirectional encoder (BERT-like) for amino acid sequences. BioNeMo2 includes checkpoints with 650M and 3B parameters;

🟢 Geneformer is a tabular scoring model that generates a dense representation of a cell's scRNA by examining co-expression patterns in individual cells.


▶️ Datasets:

🟠 CELLxGENE is a collection of publicly available single-cell datasets collected by the CZI (Chan Zuckerberg Initiative) with a total volume of 24 million cells;


🟠 UniProt is a database of clustered sets of protein sequences from UniProtKB, created on the basis of translated genomic data.


📌 Licensing: Apache 2.0 License.


🟡 Project page
🟡 Documentation
🖥 GitHub

#AI #ML #Framework #NVIDIA

Data Science | Machine Learning with Python for Researchers

10 Dec, 03:42


These channels are for programmers, coders, and software engineers.

0️⃣ Python
1️⃣ Data Science
2️⃣ Machine Learning
3️⃣ Data Visualization
4️⃣ Artificial Intelligence
5️⃣ Data Analysis
6️⃣ Statistics
7️⃣ Deep Learning
8️⃣ Programming Languages

https://t.me/addlist/8_rRW2scgfRhOTc0

https://t.me/Python53

Data Science | Machine Learning with Python for Researchers

08 Dec, 10:18


❇️ AniGS: Animatable Gaussian Avatar from a Single Image with Inconsistent Gaussian Reconstruction 🔥


🔗 Discover More:
  *  Github Link
  *  Project Page: AniGS
  *  Paper: Read the paper

https://t.me/DataScienceT

Data Science | Machine Learning with Python for Researchers

03 Dec, 13:17


📈How to make $15,000 in a month in 2024?

Easy!!! Lisa is now the hippest trader who is showing crazy results in the market!

She was able to make over $15,000 in the last month! ❗️

Right now she has started a marathon on her channel and is running it absolutely free. 💡

To participate in the marathon, you will need to :

1. Subscribe to the channel SIGNALS BY LISA TRADER 📈
2. Write in private messages : “Marathon” and start participating!

👉CLICK HERE👈

Data Science | Machine Learning with Python for Researchers

03 Dec, 08:15


🌟 INTELLECT-1: Release of the first language model trained through decentralized learning.

PRIME Intellect has published INTELLECT-1 (Instruct + Base), the first 10-billion-parameter language model collaboratively trained in 50 days by 30 participants worldwide.

PRIME Intellect used its own PRIME platform, designed to address the main problems of decentralized learning: network unreliability and dynamic management of computing nodes.

The platform utilized a network of 112 H100 GPUs across 3 continents and achieved a compute utilization rate of 96% under optimal conditions.

The training corpus consisted of 1 trillion public dataset tokens with the following percentage distribution: 55% fineweb-edu, 10% fineweb, 20% Stack V1, 10% dclm-baseline, 5% open-web-math.

▶️ Technical specifications:

🟢 Parameters: 10B;
🟢 Layers: 42;
🟢 Attention Heads: 32;
🟢 Hidden Size: 4096;
🟢 Context Length: 8192;
🟢 Vocabulary Size: 128256.

INTELLECT-1 achieved 37.5% accuracy on the MMLU test and 72.26% on HellaSwag, and outperformed several other open-source models on WinoGrande with a score of 65.82%.

While these figures lag slightly behind today's popular models, the results of the experiment are a critical step toward democratizing AI development and preventing the consolidation of AI capabilities within a few organizations.

▶️ GGUF quantized versions of INTELLECT-1_Instruct, at bit depths from 3-bit (5.46 GB) to 8-bit (10.9 GB), are available from the LM Studio community.

▶️ Example of inference on Transformers:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Create tensors and model weights on the GPU by default.
torch.set_default_device("cuda")

# Load the model and tokenizer from the Hugging Face Hub.
model = AutoModelForCausalLM.from_pretrained("PrimeIntellect/INTELLECT-1")
tokenizer = AutoTokenizer.from_pretrained("PrimeIntellect/INTELLECT-1")

# Tokenize the prompt and generate a continuation of up to 50 tokens.
input_text = "%prompt%"
input_ids = tokenizer.encode(input_text, return_tensors="pt")
output_ids = model.generate(input_ids, max_length=50, num_return_sequences=1)
output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(output_text)


📌 Licensing: Apache 2.0 License.


🟡 Article
🟡 HF Model Kit
🟡 Set of GGUF versions
🟡 Technical report
🟡 Demo
🖥 GitHub

https://t.me/DataScienceT

Data Science | Machine Learning with Python for Researchers

02 Dec, 09:33


OrientedFormer: An End-to-End Transformer-Based Oriented Object Detector in Remote Sensing Images


Published in: IEEE Transactions on Geoscience and Remote Sensing, 2024

Topic: Object detection

Paper: https://arxiv.org/pdf/2409.19648v1.pdf

GitHub: https://github.com/wokaikaixinxin/OrientedFormer

Description:

In this paper, we propose an end-to-end transformer-based oriented object detector consisting of three dedicated modules that address the challenges of oriented boxes. First, Gaussian positional encoding is proposed to encode the angle, position, and size of oriented boxes using Gaussian distributions. Second, Wasserstein self-attention is proposed to introduce geometric relations and facilitate interaction between content and positional queries by utilizing Gaussian Wasserstein distance scores. Third, oriented cross-attention is proposed to align values and positional queries by rotating sampling points around the positional query according to their angles.
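
To make the Gaussian view concrete, here is a hedged sketch of turning an oriented box (cx, cy, w, h, theta) into a 2D Gaussian and scoring two boxes with the squared 2-Wasserstein distance. The parameterization and the /4 scaling follow the common Gaussian-box convention, not necessarily the paper's exact formulation:

import numpy as np
from scipy.linalg import sqrtm

def box_to_gaussian(cx, cy, w, h, theta):
    # Mean = box centre; covariance = rotated diag((w/2)^2, (h/2)^2).
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    S = np.diag([w ** 2 / 4.0, h ** 2 / 4.0])
    return np.array([cx, cy]), R @ S @ R.T

def wasserstein2_sq(m1, S1, m2, S2):
    # Squared 2-Wasserstein distance between two Gaussians.
    root = sqrtm(sqrtm(S1) @ S2 @ sqrtm(S1)).real
    return float(np.sum((m1 - m2) ** 2) + np.trace(S1 + S2 - 2.0 * root))

m1, S1 = box_to_gaussian(10, 10, 8, 4, 0.0)
m2, S2 = box_to_gaussian(12, 10, 8, 4, np.pi / 6)
print(wasserstein2_sq(m1, S1, m2, S2))  # grows as boxes differ in position, size, or angle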

https://t.me/DataScienceT

Data Science | Machine Learning with Python for Researchers

30 Nov, 18:33


⭐️ Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement

RAG-Diffusion now supports FLUX.1 Redux!

🔥 Ready to take control? Customize your region-based images with our training-free solution and achieve powerful, precise results!

🔗 Code: https://github.com/NJU-PCALab/RAG-Diffusion

https://t.me/DataScienceT

Data Science | Machine Learning with Python for Researchers

27 Nov, 12:23


❗️ WITH LISA YOU WILL START EARNING MONEY

Lisa will leave a link with free entry to a channel that draws money every day. Each subscriber gets between $100 and $5,000.

👉🏻CLICK HERE TO JOIN THE CHANNEL 👈🏻
👉🏻CLICK HERE TO JOIN THE CHANNEL!👈🏻
👉🏻CLICK HERE TO JOIN THE CHANNEL 👈🏻

🚨FREE FOR THE FIRST 500 SUBSCRIBERS ONLY!

Data Science | Machine Learning with Python for Researchers

27 Nov, 06:00


O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson?

🖥 Github: https://github.com/gair-nlp/o1-journey

📕 Paper: https://arxiv.org/abs/2411.16489v1

🌟 Dataset: https://paperswithcode.com/dataset/lima

https://t.me/DataScienceT

Data Science | Machine Learning with Python for Researchers

22 Nov, 10:34


I highly recommend downloading the app; it includes a solid guide to mastering AI.

Data Science | Machine Learning with Python for Researchers

22 Nov, 07:00


Hey guys,

As you all know, the purpose of this community is to share notes and grow together. Hence, today I am sharing with you an app called DevBytes. It keeps you updated about dev and tech news.

This brilliant app provides curated, bite-sized updates on the latest tech news/dev content. Whether it’s new frameworks, AI breakthroughs, or cloud services, DevBytes brings the essentials straight to you.

If you're tired of information overload and want a smarter way to stay informed, give DevBytes a try.

Download here: https://play.google.com/store/apps/details?id=com.candelalabs.devbytes&hl=en-IN
It’s time to read less and know more!

Data Science | Machine Learning with Python for Researchers

21 Nov, 05:43


Explore "Pretraining LLMs," a short course developed with upstageai.

The course covers pretraining from scratch, continuing pretraining on custom data, and how using smaller open-source models can reduce costs.

Take the course for free:
https://hubs.la/Q02YFKyx0

https://t.me/DataScienceT

Data Science | Machine Learning with Python for Researchers

20 Nov, 09:53


🧹🪣 MOP+MiHo+NCC 🖼️👀: Image Matching Filtering and Refinement by Planes and Beyond

🖥 Github: https://github.com/fb82/miho

📕 Paper: https://arxiv.org/abs/2411.09484v1

🌟 Dataset: https://paperswithcode.com/dataset/scannet

https://t.me/DataScienceT

Data Science | Machine Learning with Python for Researchers

16 Nov, 05:45


OpenCoder doesn't get enough love

They open-sourced the entire pipeline to create QwenCoder-level code models.

This includes:
- Large datasets
- High-quality models
- Eval framework

Tons of great lessons and observations in the paper

📝 Paper: arxiv.org/abs/2411.04905

https://t.me/DataScienceT

Data Science | Machine Learning with Python for Researchers

14 Nov, 14:11


Coursera has launched a collaboration with the MAJOR platform to enable students to self-fund their studies through MAJOR.

Students can now access free Coursera scholarships through MAJOR.

Don't miss the opportunity: Click here.

Data Science | Machine Learning with Python for Researchers

14 Nov, 06:53


Most classical ML algorithms cannot be trained incrementally, one batch of data at a time.

This is concerning because enterprises typically deal with tabular data, and classical ML algorithms such as tree-based methods are frequently used to model it.

For instance, to train a random forest with sklearn, the entire dataset must be present in memory. This limits its use to small and medium-sized datasets.

There are two ways to extend random forests to large datasets.

1) Use big-data frameworks like Spark MLlib to train them.

2) Use random patches, which I learned from the PhD thesis of Dr. Gilles Louppe — Understanding Random Forests.

> Here’s what he proposed.

Note: This approach only works in an ensemble setting. So, you would have to train multiple models.

The idea is to sample random data patches (both rows and columns) and train a decision tree model on the patch.

Repeat this step multiple times to obtain the entire random forest model.

> Here's why it works.

The core objective of bagging is to build trees that are as different as possible.

With random patches, the data overlap between any two trees is expected to be smaller than in a typical random forest, which serves that objective even better.

His thesis presented benchmarks on 13 datasets:
- Random patches performed better than the random forest on 11 datasets.
- On the other two datasets, the difference was quite small (~0.05).

And this is how we can train a random forest model on large datasets that do not fit into memory.
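
Here is a minimal sketch of the idea, assuming the data lives in a large CSV with a "label" column (the path, column name, and sampling fractions are illustrative); each tree only ever sees one small row-and-column patch, so the full file is never loaded at once:

import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

def train_random_patches(csv_path, n_estimators=50, row_frac=0.05, col_frac=0.5,
                         chunksize=100_000, seed=0):
    rng = np.random.default_rng(seed)
    # Read only the header to learn the feature names (assumed to be few).
    feature_cols = [c for c in pd.read_csv(csv_path, nrows=0).columns if c != "label"]
    ensemble = []
    for _ in range(n_estimators):
        # Sample a random subset of columns for this tree.
        n_cols = max(1, int(col_frac * len(feature_cols)))
        cols = list(rng.choice(feature_cols, size=n_cols, replace=False))
        # Stream the file in chunks, keeping a random fraction of rows per chunk.
        parts = [chunk.sample(frac=row_frac, random_state=int(rng.integers(1 << 31)))
                 for chunk in pd.read_csv(csv_path, usecols=cols + ["label"],
                                          chunksize=chunksize)]
        patch = pd.concat(parts, ignore_index=True)
        tree = DecisionTreeClassifier(random_state=int(rng.integers(1 << 31)))
        tree.fit(patch[cols], patch["label"])
        ensemble.append((tree, cols))
    return ensemble

def predict_random_patches(ensemble, X):
    # Majority vote over the per-patch trees (X is a DataFrame with the same columns).
    votes = np.stack([tree.predict(X[cols]) for tree, cols in ensemble])
    return pd.DataFrame(votes).mode(axis=0).iloc[0].to_numpy()

When the data does fit in memory, sklearn's BaggingClassifier with max_samples, max_features, and bootstrap_features=True gives essentially the same random-patches ensemble without the manual streaming.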

https://t.me/DataScienceT ⭐️

Data Science | Machine Learning with Python for Researchers

13 Nov, 12:00


🤑EARN YOUR $100 TODAY! EASY!

Lisa Trader has launched a free marathon on her VIP channel.

Now absolutely everyone can earn from trading. It has become even easier to earn in the cryptocurrency market, you can start today!

WHAT DO YOU NEED TO START?

1. Subscribe to the channel SIGNALS BY LISA TRADER 📈.
2. Write “MARATHON” in private messages. She will then tell you how to get on the vip channel for absolutely FREE!

👉CLICK HERE👈
👉CLICK HERE👈
👉CLICK HERE👈

Data Science | Machine Learning with Python for Researchers

12 Nov, 14:44


OmniGen: Unified Image Generation

Paper: https://arxiv.org/pdf/2409.11340v1.pdf

Code: https://github.com/vectorspacelab/omnigen

Datasets: DreamBooth - MagicBrush

https://t.me/DataScienceT ⭐️

Data Science | Machine Learning with Python for Researchers

12 Nov, 14:42


Docling Technical Report

Paper: https://arxiv.org/pdf/2408.09869v3.pdf

Code 1: https://github.com/DS4SD/docling
Code 2: https://github.com/DS4SD/docling-core

https://t.me/DataScienceT

Data Science | Machine Learning with Python for Researchers

11 Nov, 20:25


📌 Practical exercises and additional materials for the book "Build a Large Language Model (From Scratch)"

A GitHub repository with practical exercises and code notebooks for developing, pre-training, and fine-tuning a GPT-style LLM, based on one of the best books on building an LLM from scratch.

▶️ About the book:
In this book, you will learn how large language models work from the inside out by creating your own LLM step by step, with each stage explained in clear language, diagrams, and examples.

The method described in the book demonstrates the approach used to create large foundation models such as those underlying ChatGPT.

In the repository, each chapter of the book has several (3-4) applied examples in ipynb format or as an executable Python script. The code is aimed at a wide audience, is designed to run on ordinary laptops, and does not require specialized hardware.

▶️ The main value of the repository is the additional practical materials, which help you study in more depth the subtleties and nuances of setting up and training an LLM:

Setup

🟢 Tips on Setting Up Python
🟢 Installing Python Packages and Libraries
🟢 Docker Environment Setup Guide

Chapter 2: Working with Text Data

🟠 Comparison of different implementations of Byte Pair Encoding (BPE)
🟠 Understanding the difference between embedding and linear layers
🟠 Dataloader Intuition with Prime Numbers

Chapter 3: Coding Attention Mechanisms

🟢 Comparison of Effective Implementations of Multi-Head Attention
🟢 PyTorch Buffers

Chapter 4: Implementing the GPT Model from Scratch

🟠 FLOPS Analysis

Chapter 5: Pre-training on unlabeled data

🟢 Alternative loading of Hugging Face weights using Transformers
🟢 Pre-training GPT on the Project Gutenberg dataset
🟢 Adding more features to the training loop
🟢 Hyperparameter optimization for pretraining
🟢 Creating a user interface for interacting with the LLM
🟢 Converting GPT to Llama
🟢 Llama 3.2 from scratch
🟢 Memory-efficient model loading

Chapter 6: Fine-tuning for Classification

🟠 More experiments on fine-tuning different layers and using larger models
🟠 Fine-tuning various models on the 50k-row IMDB movie review dataset
🟠 Building a user interface for interacting with a GPT-based spam classifier

Chapter 7: Fine-tuning to Follow Instructions

🟢 Dataset utilities for finding near duplicates and creating passive-voice entries
🟢 Evaluating instruction responses using the OpenAI and Ollama APIs
🟢 Creating a dataset for instruction fine-tuning
🟢 Improving the dataset for instruction fine-tuning
🟢 Creating a preference dataset with Llama 3.1 70B and Ollama
🟢 Direct Preference Optimization (DPO) for LLM alignment
🟢 Creating a user interface for interacting with an instruction-fine-tuned GPT model

🖥 Github

https://t.me/DataScienceT

Data Science | Machine Learning with Python for Researchers

11 Nov, 18:54


A promising digital wallet will distribute $40 for free to every user who creates an account on this wallet

Terms of creating an account: Subscribe to their channel only.

https://t.me/TronKeeperBot/app?startapp=418788114

Data Science | Machine Learning with Python for Researchers

07 Nov, 07:06


Constrained Diffusion Implicit Models!

We use diffusion models to solve noisy inverse problems like inpainting, sparse-recovery, and colorization. 10-50x faster than previous methods!

Paper: arxiv.org/pdf/2411.00359

Demo: https://t.co/m6o9GLnnZF

https://t.me/DataScienceT

Data Science | Machine Learning with Python for Researchers

06 Nov, 12:07


🎁 Your balance is credited $4,000 , the owner of the channel wants to contact you!

Dear subscriber, we would like to thank you very much for supporting our channel, and as a token of our gratitude we would like to provide you with free access to Lisa's investor channel, with the help of which you can earn today

T.me/Lisainvestor

Be sure to take advantage of our gift, admission is free, don't miss the opportunity, change your life for the better.

You can follow the link :
https://t.me/+j4-NLonPlWJmZDVh

Data Science | Machine Learning with Python for Researchers

06 Nov, 08:23


🔦 Biggest Sale Of The Year NOW ON 🔦Double 11 Shopping Festival Event is live! Check out your most loved for less. 🛍️

Enjoy SPOTO Double 11 Crazy Sale to Join Lucky Draw and win gifts worth up to $1000!💸
🎁⏯️: https://www.spotoexam.com/snsdouble11sale2024/?id=snstxrbzhussein

🔗📝Test Your IT Skills for Free: https://bit.ly/48q8Cb3

🔗📲Contact for 1v1 IT Certs Exam Help: https://wa.link/k0vy3x
🌐📚 JOIN IT Study GROUP to Get Madness Discount 👇: https://chat.whatsapp.com/HqzBlMaOPci0wYvkEtcCDa

Data Science | Machine Learning with Python for Researchers

02 Nov, 04:49


Don’t sleep on Vision Language Models (VLMs).

With the releases of Llama 3.2 and ColQwen2, multimodal models are gaining more and more traction.

VLMs are multimodal models that can handle image and text modalities:

Input: Image and text
Output: Text

They can be used for many use cases, including visual question answering or document understanding (as in the case of ColQwen2).

How do they work under the hood?

The main challenge in VLMs is to unify the image and text representations.

For this, a typical VLM architecture consists of the following components:

• image encoder (e.g., CLIP, SigLIP)
• embedding projector to align image and text representations
• text decoder (e.g., Vicuna, Gemma)

huggingface.co/blog/vlms
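
As a toy illustration of that wiring (not any specific model), the sketch below uses stand-in modules; a real VLM would load a pretrained image encoder and a causal LM decoder instead of the placeholders here:

import torch
import torch.nn as nn

class ToyVLM(nn.Module):
    def __init__(self, img_dim=768, txt_dim=1024, vocab_size=32000, n_layers=2):
        super().__init__()
        # Stand-ins for CLIP/SigLIP and Vicuna/Gemma; real models come from checkpoints.
        self.image_encoder = nn.Sequential(nn.Linear(3 * 16 * 16, img_dim), nn.GELU())
        self.projector = nn.Linear(img_dim, txt_dim)  # aligns image and text spaces
        self.token_emb = nn.Embedding(vocab_size, txt_dim)
        block = nn.TransformerEncoderLayer(txt_dim, nhead=8, batch_first=True)
        # Self-attention stack standing in for the LM decoder (no causal mask in this toy).
        self.decoder = nn.TransformerEncoder(block, num_layers=n_layers)
        self.lm_head = nn.Linear(txt_dim, vocab_size)

    def forward(self, image_patches, text_ids):
        img_feats = self.image_encoder(image_patches)       # (B, n_patches, img_dim)
        img_tokens = self.projector(img_feats)               # (B, n_patches, txt_dim)
        txt_tokens = self.token_emb(text_ids)                # (B, seq_len, txt_dim)
        fused = torch.cat([img_tokens, txt_tokens], dim=1)   # image tokens first
        return self.lm_head(self.decoder(fused))             # next-token logits

# Dummy forward pass: one image as 196 flattened 16x16 RGB patches plus 12 text tokens.
model = ToyVLM()
logits = model(torch.randn(1, 196, 3 * 16 * 16), torch.randint(0, 32000, (1, 12)))
print(logits.shape)  # torch.Size([1, 208, 32000])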

https://t.me/DataScienceT

Data Science | Machine Learning with Python for Researchers

29 Oct, 19:27


📖 LLM-Agent-Paper-List is a repository of papers on the topic of agents based on large language models (LLM)! The papers are divided into categories such as LLM agent architectures, autonomous LLM agents, reinforcement learning (RL), natural language processing methods, multimodal approaches and tools for developing LLM agents, and more.

🖥 Github

https://t.me/DataScienceT

Data Science | Machine Learning with Python for Researchers

27 Oct, 16:20


SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree

🖥 Github: https://github.com/mark12ding/sam2long

📕 Paper: https://arxiv.org/abs/2410.16268v1

🤗 HF: https://huggingface.co/papers/2410.16268

Data Science | Machine Learning with Python for Researchers

23 Oct, 12:05


🚨With me you will make money! I have made over $20,000 in the last week! 🔥

I don't care where you are and what you can do, I will help absolutely everyone earn money.

My name is Lisa and:
✔️ I will teach you trading for FREE in a short period of time
✔️ I will give you FREE signals every day
✔️ I will help you to get income of 1,000$ in a week

Sounds unbelievable?

You have 2 hours to join our channel.

But it’s true - just look at the results in my channel and JOIN FOR FREE 👉🏻 https://t.me/+fJ0XM3sZkaxkNjgx

Data Science | Machine Learning with Python for Researchers

21 Oct, 06:22


Benchmarking Agentic Workflow Generation ⭐️

ArXiv:
https://arxiv.org/abs/2410.07869

Website:
https://www.zjukg.org/project/WorFBench/

Data:
https://huggingface.co/collections/zjunlp/worfbench-66fc28b8ac1c8e2672192ea1

Github:
https://github.com/zjunlp/WorFBench

https://t.me/DataScienceT

Data Science | Machine Learning with Python for Researchers

20 Oct, 08:42


Estimating body and hand motion from a pair of glasses 🤓

website:
http://egoallo.github.io

code:
http://github.com/brentyi/egoallo

https://t.me/DataScienceT 🏵

Data Science | Machine Learning with Python for Researchers

16 Oct, 12:08


LOOKING FOR A NEW SOURCE OF INCOME?
Average earnings from 100$ a day

Lisa is looking for people who want to earn money. If you are responsible, motivated and want to change your life. Welcome to her channel.

WHAT YOU NEED TO WORK:
1. phone or computer
2. Free 15-20 minutes a day
3. desire to earn

❗️ Requires 20 people ❗️
Access is available at the link below
👇

https://t.me/+NhwYZAXFlT8yZDIx

Data Science | Machine Learning with Python for Researchers

15 Oct, 20:42


Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts

💻 Github: https://github.com/freedomintelligence/apollomoe

🔖 Paper: https://arxiv.org/abs/2410.10626v1

🤗 Dataset: https://paperswithcode.com/dataset/mmlu

https://t.me/DataScienceT 🏵

Data Science | Machine Learning with Python for Researchers

12 Oct, 11:28


Generalizable and Animatable Gaussian Head Avatar

🖥 Github: https://github.com/xg-chu/gagavatar

📕 Paper: https://arxiv.org/abs/2410.07971v1

https://t.me/DataScienceT 🏵