PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing

Abstract: Large language models have improved natural language understanding, generation, and reasoning. A system was developed that trained a trillion-parameter language model on a cluster of Ascend 910 AI processors using the MindSpore framework. The language model was named PanGu-Σ and had 1.085T parameters. Random Routed Experts (RRE) was used to extend the dense Transformer model to a sparse one....

March 20, 2023 · 1182 words · Xiaozhe Ren, Pingyi Zhou, Xinfan Meng, Xinjing Huang, Yadao Wang and 12 others
NeRF-LOAM: Neural Implicit Representation for Large-Scale Incremental LiDAR Odometry and Mapping

Abstract: Simultaneous odometry and mapping using LiDAR data is important for mobile systems to achieve full autonomy in large-scale environments. Most existing LiDAR-based methods prioritize tracking quality over reconstruction quality. A novel NeRF-based LiDAR odometry and mapping approach is proposed, consisting of three modules. The approach requires no pre-training and exhibits strong generalization abilities....

March 19, 2023 · 1000 words · Junyuan Deng, Xieyuanli Chen, Songpengcheng Xia, Zhen Sun, Guoqing Liu and 2 others
Improving Uncertainty Quantification of Deep Classifiers via Neighborhood Conformal Prediction: Novel Algorithm and Theoretical Analysis

Abstract: Uncertainty quantification is necessary for the safe deployment of deep neural networks. Conformal prediction is a framework for uncertainty quantification of deep models. Neighborhood Conformal Prediction (NCP) is a novel algorithm to improve the efficiency of uncertainty quantification. NCP uses the learned representation of the neural network to create adaptive prediction sets. Experiments show that NCP leads to a significant reduction in prediction set size....

March 19, 2023 · 808 words · Subhankar Ghosh, Taha Belkhouja, Yan Yan, Janardhan Rao Doppa
Two Kinds of Recall

Abstract: Pattern-based models are good at precision, while learning-based models are better at recall. There are two kinds of recall: d-recall (diversity) and e-recall (exhaustiveness). Neural methods are better at d-recall, but pattern-based methods can be better at e-recall. Evaluations should aim for both kinds of recall.
Paper Content
Introduction: Pattern-based methods are more precise, while learning-based methods have better recall. Recent advances in neural-network-based models have made learning-based methods more precise. There are two kinds of recall: diversity and exhaustiveness. Pattern-based methods are better at exhaustiveness, while learning-based methods are better at diversity. Current datasets and evaluation methods focus primarily on diversity recall.
Background: Dependency trees and syntactic patterns. Extractive question answering and the SQuAD dataset: SQuAD is a collection of over 150,000 <question, passage> pairs. The dataset is used to train machine learning models to perform extractive QA. SQuAD v 2....

March 19, 2023 · 610 words · Yoav Goldberg
Can AI-Generated Text be Reliably Detected?

Abstract: LLMs can perform well on various tasks. Unregulated use of LLMs can lead to malicious consequences. Detection of AI-generated text is critical. Recent works attempt to detect AI-generated text. Paraphrasing attacks can break detectors. A theoretical impossibility result indicates that the best-possible detector can only perform marginally better than a random classifier. LLMs protected by watermarking schemes can be vulnerable to spoofing attacks.
Paper Content
Introduction: Artificial Intelligence (AI) has made advances in computer vision and natural language processing. Large Language Models (LLMs) can generate texts of high quality with potential applications. AI tools can be misused for unethical purposes such as plagiarism, fake news, and social engineering. Recent research focuses on detecting AI-generated texts. Detection works study this problem as a binary classification problem. Zero-shot AI text detection without additional training overhead is also studied. Watermarking is used to ease the detection of LLM output text. AI-text detectors are not reliable in practical scenarios. Paraphrasing attacks can evade various types of detectors. The best-possible detector can perform only marginally better than a random classifier. Spoofing attacks on text generative models are possible. Identifying AI-generated text is important to avoid misuse. Vulnerable detectors can cause damage, such as falsely accusing a human of plagiarism.
Evading AI detectors using paraphrasing attacks: Detecting AI-generated text is important for LLM security. AI text detectors can identify LLM signatures in text. Paraphrasing attacks can remove these signatures without changing the meaning of the text. Detectors face a trade-off between minimizing type-I and type-II errors.
Paraphrasing attacks on watermarked AI-generated text: Experiments were performed on the soft watermarking scheme proposed in Kirchenbauer et al....

March 17, 2023 · 946 words · Vinu Sankar Sadasivan, Aounon Kumar, Sriram Balasubramanian, Wenxiao Wang, Soheil Feizi
On the De-duplication of LAION-2B

Abstract: Generative models have implications beyond computer science. LAION-2B is a large image database with two billion images. Manual inspection and automated analysis of LAION-2B are difficult. Duplicated images in LAION-2B pose copyright problems. An algorithmic chain is proposed to detect duplicates in LAION-2B. 30% of LAION-2B images are likely duplicated. Histograms of duplication are used to reveal more examples of verbatim copies. The de-duplicated set will be distributed online.
Paper Content
Introduction: AI models have a societal impact beyond computer science. Large image databases have improved computer vision. LAION-5B is a publicly available dataset with billions of image-caption pairs. Datasets are collected via automated web scrapers. Duplicate images can cause problems. Retrieval systems are used to find duplicates in large datasets.
Related work: The CLIP network has achieved SOTA performance on zero-shot and transfer tasks. It employs a contrastive loss to align image and text feature representations, and it is used to condition text-to-image models. The open source repository OpenCLIP has reproduced the results of the original CLIP paper. Carlini et al....

March 17, 2023 · 681 words · Ryan Webster, Julien Rabin, Loic Simon, Frederic Jurie
A Recipe for Watermarking Diffusion Models

Abstract: Diffusion models (DMs) have potential for generative tasks. Watermarking is a solution for copyright protection and content monitoring in DMs. A recipe for efficiently watermarking state-of-the-art DMs is provided.
Paper Content
Introduction: DMs have demonstrated impressive performance on generative tasks. DMs have advantages over other generative models. Growing interest in controllable generation has led to the creation of large-scale DMs. Legal issues arise with the use of DMs, such as copyright protection and detecting generated content. Watermarks have been used to protect copyright and detect fake content. This paper develops two watermarking pipelines for DMs. Ablation studies are conducted to investigate the possibility of watermarking DMs.
Related work: Diffusion models (DMs) are generative learning approaches used in image generation....

March 17, 2023 · 922 words · Yunqing Zhao, Tianyu Pang, Chao Du, Xiao Yang, Ngai-Man Cheung and 1 other
GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models

Abstract: GPT models and related technologies could have implications for the US labor market. A new rubric was used to assess occupations based on their correspondence with GPT capabilities. 80% of the US workforce could have at least 10% of their work tasks affected by GPTs. 19% of workers may see at least 50% of their tasks impacted....

March 17, 2023 · 1727 words · Tyna Eloundou, Sam Manning, Pamela Mishkin, Daniel Rock
A Robustness Analysis of Blind Source Separation

Abstract: Blind source separation (BSS) is the problem of recovering an unobserved signal from its mixture. A framework is presented for analyzing violations of statistical prior assumptions and quantifying their impact on the recovery of the signal. The behaviour of a generic BSS solution is analysed in terms of explicit continuity guarantees with respect to an informative topology....

March 17, 2023 · 1735 words · Alexander Schell
$α$Surf: Implicit Surface Reconstruction for Semi-Transparent and Thin Objects with Decoupled Geometry and Opacity

Abstract: The signed distance function (SDF) is a promising approach for image-based surface reconstruction. Existing optimization methods assume solid surfaces and cannot reconstruct semi-transparent surfaces and thin structures. Neural radiance field (NeRF) based methods can model semi-transparency but cannot be easily converted into surfaces without introducing artifacts. $\alpha$Surf is a novel surface representation with decoupled geometry and opacity for the reconstruction of semi-transparent and thin surfaces. Ray-surface intersections can be found in closed form via analytical solutions of cubic polynomials. $\alpha$Surf can accurately reconstruct surfaces with semi-transparent and thin parts with fewer artifacts.
Paper Content
Introduction: Recovering surfaces from RGB images is a complex and challenging task in computer vision....

March 17, 2023 · 1071 words · Tianhao Wu, Hanxue Liang, Fangcheng Zhong, Gernot Riegler, Shimon Vainer and 1 other