ChatGPT Gets New Features
Plus an AI tool for Audio and Music
Good morning. It’s Friday, August 4th.
Did you know: The New York Times has covered LK-99, the purported ambient-temperature superconductor that is currently under scrutiny by physicists online.
In today’s email:
AI Models and Development
Business and Investments
Social Media and Entertainment
Environment and Science
5 New AI Tools
Latest AI Research Papers
You read. We listen. Let us know what you think of this edition by replying to this email, or DM us on Twitter.
Today’s edition is brought to you by:
Test against an AI digital twin of your audience and refine your text without wasting your budget.
How it works:
Identify your audience: Choose from a digital twin that we’ve already built, or create one of your own with any text data you can provide.
Predict consumer behavior: Measure the resonance between your text and audience with a unified metric that is correlatively accurate to the real world.
Refine your messaging: Improve the performance of any text without spending money on multivariate testing, development, or hunches.
Thank you for supporting our sponsors!
Today’s trending AI news stories
AI Models and Development
ChatGPT is Rolling Out a Bunch of Small Updates to Improve the Experience: OpenAI is rolling out a series of small updates to ChatGPT over the next week. They include prompt examples to help users get started, suggested replies for deeper conversations, and GPT-4 as the default model for Plus users. Users will also be able to upload multiple files for analysis and will stay logged in longer, while new keyboard shortcuts will speed up common actions.
Alibaba rolls out open-sourced AI model to take on Meta's Llama 2: Alibaba Cloud has released two open-sourced AI models, Qwen-7B and Qwen-7B-Chat, each with 7 billion parameters, aiming to challenge Meta’s Llama 2. The move follows Meta’s similar open-sourced model and aims to compete with AI models from OpenAI and Google, potentially reducing their market dominance. Alibaba Cloud’s models are freely accessible to academics, researchers, and commercial institutions worldwide, with licensing requirements for companies with large user bases.
IBM and NASA Open Source Largest Geospatial AI Foundation Model on Hugging Face: IBM and Hugging Face have taken a major stride in democratizing AI access for tackling climate change. The tech giant and open-source AI platform are releasing the largest geospatial AI foundation model, constructed from NASA’s satellite data. This move aims to accelerate climate-related discoveries by making Earth science data more accessible for geospatial intelligence. The model, available on Hugging Face, promises groundbreaking potential for tasks such as flood and burn scar mapping, as well as deforestation tracking and greenhouse gas monitoring.
AI.com domain flips from ChatGPT to Elon Musk’s X.ai: AI.com, which previously redirected to OpenAI’s ChatGPT interface, has apparently been acquired and now redirects to Elon Musk’s X.ai.
China's Tencent says it is expanding testing of 'Hunyuan' AI model: Chinese tech giant Tencent Holdings has announced the expansion of testing for its self-developed AI model, “Hunyuan.” The AI model has been integrated with Tencent’s internal services and products, including Tencent Cloud and Tencent Docs. The move comes after Chinese regulators published interim rules on generative AI, allowing companies like Tencent to roll out AI-powered products once they obtain approvals.
Introducing AudioCraft: A Generative AI Tool For Audio and Music: AudioCraft from Meta allows users to easily generate high-quality audio and music from text prompts. Meta is open-sourcing MusicGen, AudioGen, and EnCodec for research purposes, enabling researchers and practitioners to train their own models with their own datasets. The AudioCraft family of models can produce high-quality audio with long-term consistency and is designed to simplify the design of generative models for audio.
The future of AI is video, and it’s coming at us fast: AI is rapidly moving into video applications, sparking a mix of excitement and apprehension. Companies like Lightricks are showcasing next-gen video editing software capable of copying the style of one video onto another. However, concerns are growing over the potential misuse of AI-generated content, particularly in the context of elections and the spread of misinformation.
Blogging portal Medium bans AI content, saying it's "a home for human writing": The blogging platform Medium.com has implemented a policy requiring users to tag AI-generated content; otherwise, it will not be distributed. Medium’s content director, Scott Lamb, stated that the platform is a “home for human writing” and that, while AI-assisted writing is allowed, fully AI-generated text lacks human wisdom and is considered unfit for Medium. This move follows other content platforms, such as ArtStation and Stack Overflow, which have taken measures to limit or ban AI-generated content to maintain content quality and authenticity.
AI in Business and Investments
Investments in generative AI lead to rising costs for Apple, Tim Cook says 'we're investing a lot': Apple’s generative AI investments are pushing up costs, according to CEO Tim Cook, resulting in a 2% decline in Apple shares. Despite exceeding Q3 analyst predictions, the company foresees a sustained sales slump. While iPhone sales disappointed investors, strong performances in China and the services segment, including Apple TV, are helping to mitigate the impact. Apple remains optimistic about Q4 revenue performance, with growth in services, iPad, and Mac sales expected to drive results.
‘Every single’ Amazon team is working on generative AI, says CEO: Amazon CEO Andy Jassy made a significant announcement during the Q2 2023 earnings call, stating that “every single one” of Amazon’s businesses is actively engaged in multiple generative AI initiatives. Jassy highlighted the vital role AI plays across the company, from cost-effectiveness to enhancing customer experiences. The company’s infrastructure and AWS services support various generative AI applications. The tech giant might unveil AI-based improvements for Alexa at an upcoming event.
CoreWeave, which provides cloud infrastructure for AI training, secures $2.3B loan: CoreWeave, a GPU-focused cloud compute provider, has secured $2.3 billion in debt financing led by existing investors. The company plans to use the funds to meet growing demand for AI training and general-purpose computing. CoreWeave’s strategic investments in specialized GPU cloud infrastructure have paid off, attracting startups like Inflection AI, which trained its AI assistant product on CoreWeave’s infrastructure.
The Growing Conversational AI and Virtual Assistant Market: The conversational AI and virtual assistant market is projected to reach $18.6 billion in 2023, with a 16.2% growth rate, driven by the adoption of cloud-based contact services. Gartner predicts 24% growth in the virtual assistant market. As AI continues to mature, it may replace traditional contact center platforms, helping organizations improve customer service efficiency.
AI in Social Media and Entertainment
Lupe Fiasco teams up with Google to create AI rhyming tools for rappers: Lupe Fiasco, the rapper and academic, has collaborated with Google to create TextFX, an AI-powered writing toolkit for rappers. TextFX explores creative possibilities in text and language by using large language models, including few-shot learning. The tool prompts users to view words or phrases from different perspectives, allowing them to experiment with simile creation, word explosion, and scene generation. This collaboration highlights the potential of AI to boost human creativity and extends beyond creative writing to other domains. The toolkit is available for experimentation here and its code has been open-sourced.
Instagram is working on labels for AI-generated content: Instagram is reportedly developing notices to identify AI-generated content on its platform. The feature would inform users when content has been “created or edited with AI.” This move comes after commitments by Meta and other major AI players to develop AI responsibly and combat misinformation. The specifics of the labeling system are unclear, but labels may be applied automatically in some cases. Meta has been open-sourcing its AI models, but has yet to widely release generative AI features for Instagram.
Tinder tests AI photo selection feature to help users build profiles: Tinder is testing an AI photo selection feature to help users build dating profiles. The feature uses AI to analyze a user’s photo album and select the five best pictures that represent them accurately. The aim is to streamline profile creation and alleviate the challenge of choosing the right photos. Match Group, Tinder’s parent company, is also experimenting with other AI features, including leveraging AI to improve content relevancy.
AI in Environment and Science
Scientists buzzing with hopes for AI bee research: Researchers from the University of Edinburgh are using AI to identify threatened bees in the wild, aiming to conserve their populations. The team records and analyzes thousands of bee sounds, linking them with comprehensive details about the bees and their environment. The technology may eventually be used for “remote acoustic monitoring stations,” aiding conservation efforts by identifying concerning changes in bee populations automatically.
AI use in breast cancer screening as good as two radiologists, study finds: AI has been found to be safe and can reduce the workload of radiologists in breast cancer screening, according to a comprehensive trial. The study involved over 80,000 women in Sweden and found that AI-supported screening was as effective as two radiologists working together, while also reducing false positives and almost halving the radiologists’ workload. The final results are yet to be published, but the interim analysis suggests that AI in mammography screening is safe and has the potential to improve efficiency in breast cancer detection and diagnosis.
🎧 Did you know AI Breakfast has a podcast read by a human? Join AI Breakfast team member Luke (an actual AI researcher!) as he breaks down the week’s AI news, tools, and research: Listen here
5 new AI-powered tools from around the web
Superluminal is a conversational data interaction tool designed for data dashboards, offering context, synthesis, code generation, and prompt optimization for valuable insights. Developers can integrate an AI copilot using its customizable React component and API, with secure hosting and data encryption for data protection.
HeyPhoto offers an effortless way to fix faces in photos, with AI-powered tools like adjusting gaze direction, smile, gender, and age. Simply drag a slider to get stunning results with the original quality preserved. With user feedback, HeyPhoto continues to improve and expand its features for picture-perfect photos.
ProShots creates professional headshots with AI. Select location, pose, and attire, and ProShots generates realistic headshots. No need for a photographer or expensive equipment. High-resolution images and batch generation are available.
MarketAI revolutionizes video production with high-quality, captivating marketing videos for startups and enterprises. Up to 10x faster and 7x cheaper than traditional agencies, blending technology and expertise for exceptional results. Three easy steps: fill info & script, review the storyboard, and enjoy your video.
SkillAI empowers learners to unleash their potential by generating personalized learning paths for any skill using AI. With progress tracking, a free plan, and premium options, users can embark on a journey of growth and success.
arXiv is a free online library where scientists share their research papers before they are published. Here are the top AI papers for today.
DeepSpeed-Chat is a groundbreaking system developed by Microsoft that democratizes Reinforcement Learning with Human Feedback (RLHF) training for ChatGPT-like models. It offers three key capabilities: an easy-to-use training and inference experience, a DeepSpeed-RLHF pipeline that replicates the InstructGPT training process, and a robust DeepSpeed-RLHF system that combines various optimizations. The system enables training models with hundreds of billions of parameters in record time and at a fraction of the cost.
OpenFlamingo is an open-source framework for training large autoregressive vision-language models, ranging from 3B to 9B parameters. It replicates DeepMind’s Flamingo models and allows models to process interleaved sequences of images and text, enabling in-context learning. On average, the models reach between 80% and 89% of the corresponding Flamingo models' performance on various vision-language datasets. OpenFlamingo’s architecture includes dense cross-attention modules between image and text tokens, and it utilizes publicly available components like CLIP as a vision encoder. By releasing the models and code, OpenFlamingo promotes research in autoregressive vision-language models, which was previously limited by closed-source proprietary alternatives.
The paper introduces Dynalang, an agent that learns to understand diverse types of language and use them to predict the future and interact with the environment. The key idea is that language helps agents make better predictions about future observations and rewards. The agent learns a multimodal world model to predict future text and image representations and learns to act through imagined model rollouts. Dynalang outperforms traditional agents in various tasks and can be pretrained on text and video data without actions or rewards. Despite some limitations, Dynalang shows promise in scaling to large web datasets and offers a versatile approach to language understanding in AI.
Watermarking for Large Language Models (LLMs) shows promise for identifying generated text, but existing methods rely on biased statistical tests when estimating false positive rates. This study introduces robust statistical tests, evaluates watermarks on natural language processing benchmarks, and explores advanced detection schemes. Multi-bit watermarking is proposed for user or model version identification. The research presents watermarks as a reliable method to trace LLM outputs and to promote responsible model usage.
The paper presents SelfCheck, a novel zero-shot checking scheme for LLMs to recognize errors in their own step-by-step reasoning without external resources. SelfCheck decomposes the checking task into stages, prompting the LLM to extract the target, collect relevant information, regenerate the step, and compare the result with the original step. An integration function then combines the step-checking results into a confidence score for the whole solution. Evaluations on math datasets show that SelfCheck significantly improves prediction accuracy and provides accurate confidence estimates for LLM solutions. It demonstrates the potential for LLMs to perform effective self-verification.
Thank you for reading today’s edition.
Your feedback is valuable.
Respond to this email and tell us how you think we could add more value to this newsletter.