Understanding Black-Box Algorithm Attacks on AI Systems

Best-of-N Jailbreaking: A Simple Black-Box Algorithm Attack on Cutting-Edge AI Systems

https://arxiv.org/abs/2412.03556

Best-of-N (BoN) Jailbreaking is a simple black-box algorithm that can attack cutting-edge AI systems across modalities. It works by repeatedly sampling augmented versions of a prompt (for example, randomly shuffling or capitalizing the text) until a harmful response is elicited. The research reports that BoN Jailbreaking achieves attack success rates (ASR) of 89% on GPT-4o and 78% on Claude 3.5 Sonnet when sampling 10,000 augmented prompts, and that it can also bypass state-of-the-art open-source defenses such as circuit breakers.
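
Below is a minimal sketch of the core BoN loop for the text modality, assuming hypothetical query_model and is_harmful callables standing in for the target model API and a harmfulness classifier (neither is taken from the paper's released code); the augmentations shown (character scrambling and random capitalization) follow the paper's description.

```python
import random

def augment(prompt: str) -> str:
    """Apply simple text augmentations: scramble interior characters of
    words and randomly flip letter casing (as described for BoN)."""
    words = []
    for word in prompt.split():
        chars = list(word)
        if len(chars) > 3:
            # shuffle the interior characters, keeping the word roughly readable
            middle = chars[1:-1]
            random.shuffle(middle)
            chars = [chars[0]] + middle + [chars[-1]]
        words.append("".join(chars))
    scrambled = " ".join(words)
    return "".join(c.upper() if random.random() < 0.5 else c.lower() for c in scrambled)

def best_of_n_jailbreak(prompt: str, query_model, is_harmful, n: int = 10_000):
    """Repeatedly sample augmented prompts until a harmful response is elicited.

    query_model(prompt) -> str and is_harmful(response) -> bool are assumed
    stand-ins for the target model and a harmfulness classifier.
    """
    for i in range(n):
        candidate = augment(prompt)
        response = query_model(candidate)
        if is_harmful(response):
            return candidate, response, i + 1  # success after i + 1 samples
    return None  # attack failed within the sampling budget
```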

BoN Jailbreaking is not limited to text: using modality-specific augmentations, it also extends to vision language models (like GPT-4o) and audio language models (like Gemini 1.5 Pro). As the number of samples grows, BoN's attack success rate keeps improving, and across modalities the relationship between ASR and sample count follows power-law-like behavior. BoN can also be composed with other black-box attacks for even more effective results; for example, combining it with an optimized prefix attack increases ASR by as much as 35%.

The study shows that despite their capabilities, language models remain highly sensitive to small, innocuous-seeming changes in their inputs, and attackers can exploit this sensitivity across modalities.

Netizen codetrotter: Has anyone tried to attack “Gandalf” using this method? Especially in level 8.

Netizen retiredpapaya: I’ve never seen such a complex author contribution list.

Netizen albert_e: In layman’s terms, can this be called a “brute-force attack”?

Netizen impure: This reminds me of the paper from Apple, where they found that small changes in prompts could lead to huge differences in outcomes.

Netizen infaloda: I’ve never read such a complicated abstract.

Prioritize Using Throwaway Code Over Design Documents

https://softwaredoug.com/blog/2024/12/14/throwaway-prs-not-design-docs

In software development, we often imagine everything going according to plan: first writing design documents, then gradually implementing features through small Pull Requests (PRs), and finally having a clean and orderly Git history. However, this idealized process rarely survives contact with reality; once coding begins, the initial design document quickly becomes obsolete.

Software engineer Doug Turnbull proposes a different design methodology: "coding binges." The core steps of this method are:

  • Using draft PRs: Create a draft PR that you do not intend to merge for prototyping or proof of concept.
  • Getting early feedback: Allow team members to review the PR early and gather their thoughts on the approach.
  • Documenting methods: Record your design thoughts in the draft PR as a historical record.
  • Preparing to discard draft PRs: Be ready to discard the draft PR at any time for rapid iteration.
  • Submitting in phases: Gradually submit clean production-level PRs from the draft PR.
  • Hardening incrementally: Improve tests and robustness as each production PR lands.

This method emphasizes the importance of rapid prototyping and early feedback rather than over-relying on design documents. PRs serve as a historical record, making them more practically relevant than outdated design documents. Additionally, this approach encourages team members to maintain an open mindset, willing to try various solutions and learn from them.

Netizen comments:

  • JasonPeacock: This is called prototyping, an important part of the design process, some call it “path exploration”.
  • ElatedOwl: Writing is very beneficial for exploring the problem space. Many times, I confidently think I understand a problem, but after starting to write, I discover new critical issues.
  • m1n1: We once had to build and release a temporary suboptimal version to meet a deadline, avoiding a costly contract renewal. This temporary version not only helped us take off on time but also revealed missing requirements in the original design during subsequent flights.
  • ChrisMarshallNY: The biggest problem is that no one reads design documents, even when employers require them. Another problem is that people treat prototypes as final code and then force me to keep using them.
  • robertclaus: The types of feedback people give between code and design documents are very different. Design documents encourage “why” questions, while once the prototype works, those questions are hard to raise.
  • delichon: I once lost three months of code due to a hard drive failure, but rewriting it only took two weeks. The second time, I had a clearer direction and better code quality.
  • hcarvalhoalves: If you think the point of writing lots of code is to find the best solution, you’ll soon be replaced by GPT. The real challenge is reaching consensus on what should be built.
  • jp57: Who would imagine software development is a clean and tidy process? Professors? In reality, software development is more like writing than building houses or bridges.
  • simonw: I love this process: recording design decisions through continuous comment threads instead of trying to formalize them in a document.
  • miscaccount: Design documents can help narrow down the number of prototypes, especially when exploring something entirely new. While showing is better than telling, design documents are easier for new employees to understand.

Editor’s note: This method not only improves development efficiency but also promotes communication and collaboration within teams. Through rapid prototyping and early feedback, problems can be discovered and resolved sooner, improving the overall quality of the project.

SmartHome – A Game That Lets You Experience Smart Home Hell

https://smarthome.steviep.xyz

SmartHome is a free browser game developed by scyclow using pure JavaScript without any external libraries. This game combines various elements such as text adventure, point-and-click adventure, puzzle, escape room, art game, incremental game, cozy game, and role-playing game. If you enjoy these types of games, then SmartHome is definitely worth a try.

In the game, players need to complete various tasks in a simulated smart home environment, such as creating accounts, paying bills, and operating smart devices. However, the game’s design aims to simulate the complexities of real-life smart home systems, filled with various maddening details and challenges. For example, you need to remember and input phone numbers, passwords, and other information, which may frustrate some players.

Player feedback:

  • Netizen esperent: I played for a while on my phone, but I gave up because I had to remember and input too much information. The game experience would be better if text fields could be auto-filled.
  • Netizen kbrackbill: This game replicates the chaotic scenes in corporate environments very well. Although the design intention is to create frustration, it indeed makes the game quite unpleasant.
  • Netizen Groxx: This game is so well done that it’s absolutely maddening. I suggest adding some random delays to make the search results feel more realistic.
  • Netizen j0hnyl: I had the privilege of playtesting before the game’s release. Although many players may give up early on, I strongly recommend playing for at least 10-15 minutes; you will discover many interesting details and Easter eggs.
  • Netizen UniverseHacker: I once stayed in an Airbnb fully equipped with smart home devices, and this game perfectly recreates that experience. I give it a negative 10 points.
  • Netizen neumann: This game is both painful and wonderful. I love this feeling, even if I can only last for 10 minutes.
  • Netizen brumar: I wish there were an option to “wait to die in the corner.” Although the game tried my patience, it was genuinely interesting.
  • Netizen jasfi: I love the game’s style – minimalist and focused on language. I hope to see more similar games in the future, but without deliberately frustrating players.
  • Netizen codebje: I made enough money in the game to pay the entrance fee, but wasted too much time during the transfer process. The only good news is that once you start making money, it’s easy to accumulate a lot of wealth.

Editor’s note: SmartHome is not just a game; it’s an experience. It allows players to deeply understand the conveniences and troubles brought by technology by simulating the complexities and frustrations of smart home systems. Although the game design may annoy some players, it is precisely this unique experience that makes it a work worth trying. If you are interested in smart homes, take some time to experience it; you might have a different feeling.

Replacing My Son’s School Schedule App with an E-Ink Display

https://mfasold.net/blog/displaying-website-content-on-an-e-ink-display/

Recently, I was looking for a way to improve the morning routine at home: checking my child’s school schedule and substitute plans daily. The schedule can be viewed through the school’s website or a mobile app called “VPmobil,” but due to strict parental controls on my son’s phone, this has become an unwelcome task. Therefore, I decided to upgrade the traditional paper schedule on the fridge to an e-ink display that automatically fetches and displays the latest schedule, including daily changes.

Project goals:

  • Low power consumption: The device should operate without direct power and have a battery life of at least a few weeks.
  • No disruption: Display information without interacting with digital devices to reduce distractions.

Hardware selection: For this I chose the Inkplate product line from Soldered, specifically the Inkplate 6COLOR, which features a 600×448 pixel 6-color display. The board has a built-in ESP32 microcontroller and supports Wi-Fi connectivity, battery charging, and a real-time clock, making it ideal for low-power applications. I purchased the complete kit, including the mainboard, e-ink display, casing, and battery, for a total of 169 euros.

Technical implementation:

  • Logging in to the website: Automate the login and navigation to the schedule page using Playwright.
  • Screenshot: Capture the relevant part of the schedule page and adjust the style to fit the e-ink display’s needs.
  • Image processing: Use the Pillow library to process the screenshot, adjusting colors and sizes to ensure compatibility with the screen.
  • Server-side application: Create a simple web server using Werkzeug to provide the processed image to the e-ink display device (a combined sketch of the capture, processing, and serving steps follows this list).
  • Device-side code: Write device-side code using Arduino library to periodically fetch images from the server and display them.
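
A condensed sketch of the server-side pipeline, assuming a placeholder schedule URL and a plain page screenshot rather than the post's authenticated VPmobil login flow; the Playwright, Pillow, and Werkzeug calls are standard, but the endpoint, sizes, and color reduction are illustrative. The device-side Arduino code would then simply fetch this image on a schedule.

```python
from io import BytesIO

from PIL import Image
from playwright.sync_api import sync_playwright
from werkzeug.serving import run_simple
from werkzeug.wrappers import Request, Response

SCHEDULE_URL = "https://example.org/schedule"  # placeholder; the real page requires a login flow

def capture_schedule() -> bytes:
    """Grab a screenshot of the schedule page with Playwright."""
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page(viewport={"width": 600, "height": 448})
        page.goto(SCHEDULE_URL)
        png = page.screenshot()
        browser.close()
    return png

def prepare_for_inkplate(png: bytes) -> bytes:
    """Resize and reduce colors so the image suits the 600x448 6-color panel."""
    img = Image.open(BytesIO(png)).convert("RGB")
    img = img.resize((600, 448))
    img = img.quantize(colors=7)  # rough stand-in for the panel's limited palette
    out = BytesIO()
    img.convert("RGB").save(out, format="PNG")
    return out.getvalue()

@Request.application
def app(request):
    """Serve the latest processed schedule image to the e-ink device."""
    return Response(prepare_for_inkplate(capture_schedule()), mimetype="image/png")

if __name__ == "__main__":
    run_simple("0.0.0.0", 8080, app)
```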

Results: I ended up mounting the e-ink display in our hallway, and my son checks the schedule at least twice a day. The device runs very reliably and also shows special circumstances such as extracurricular activities, class groupings, and classroom changes. Battery life exceeds 8 weeks, so it only needs recharging during the school holidays.

Selected comments:

  • Netizen IshKebab: Arduino software is really terrible, but the Inkplate 10 hardware is indeed excellent.
  • Netizen modernerd: This project shows the potential of putting high-quality, paper-like screens on more surfaces, which could expand to other interactive or non-interactive uses in the future.
  • Netizen dagss: Parents’ restrictions on children’s phone usage are commendable; today’s teenagers are severely distracted by smartphones, affecting their creativity.
  • Netizen simon_acca: If someone wants similar functionality but doesn’t want to tinker, I recommend my friend’s product Inklay, which I’ve been using for over a year and am very satisfied with.
  • Netizen kleiba: Digital services in Germany are really lagging behind; I hope this situation improves.

Phi-4: Microsoft’s Latest Small Language Model Focused on Complex Reasoning

https://techcommunity.microsoft.com/blog/aiplatformblog/introducing-phi-4-microsoft%E2%80%99s-newest-small-language-model-specializing-in-comple/4357090

Microsoft recently released Phi-4, the latest member of the Phi series: a 14-billion-parameter small language model focused on complex reasoning, with a particular strength in mathematics. Phi-4 not only performs well on mathematical competition problems but also delivers high-quality results on general language processing tasks.

Advantages of Phi-4:

  • High performance: Phi-4 outperforms comparable and even larger models in mathematical reasoning, thanks to high-quality synthetic datasets, carefully curated organic data, and innovative post-training techniques.
  • Resource efficiency: Despite its smaller parameter count, Phi-4 can still provide high-precision results, making it suitable for deployment in resource-constrained environments.
  • Safe and responsible AI: Microsoft emphasizes the robust security features provided in Azure AI Foundry, helping users ensure quality and security when developing and deploying AI solutions.

Real-world application cases: Phi-4’s strength in solving mathematical problems has been fully demonstrated. For instance, it successfully solved multiple complex mathematical competition problems. Users can experience the powerful capabilities of Phi-4 on Azure AI Foundry.

Community feedback:

  • Netizen simonw: The most interesting part is that Phi-4 is trained using synthetic data, which is detailed in the technical report. While Microsoft has not officially released the model weights, unofficial versions are already available for download on Hugging Face.
  • Netizen thot_experiment: In terms of following prompts, Phi-4 is still not as good as Gemma2 27B. I am not very satisfied with the Phi series models.
  • Netizen xeckr: Phi-4’s performance exceeds its parameter count, which is impressive.
  • Netizen jsight: I am not very interested in the 14 billion parameter version of Phi-4; this scale is no longer small, and I doubt it will have a significant advantage over Llama 3 or Mistral in performance.

Editor’s note: The release of Phi-4 undoubtedly brings new hope to the field of small language models. Although some users have raised questions about its actual performance, it is undeniable that Phi-4’s breakthrough in mathematical reasoning provides a new direction for the future development of small language models. With more research and practical applications unfolding, we look forward to Phi-4 showcasing more potential in real-world applications.

Building a High-Performance LLM Inference Engine from Scratch

https://andrewkchan.dev/posts/yalm-assets/

In today’s AI era, the application of large language models (LLMs) is becoming increasingly widespread, but their efficient inference implementation remains a challenge. Andrew Chan details how to build a high-performance LLM inference engine from scratch on his blog, with a particular focus on optimization methods using CUDA.

Key technical points:

  • Thread utilization: Allocate work across threads so the GPU’s compute resources are fully used and no cores sit idle.
  • Matrix multiplication optimization: Use block-level parallelism and warp reduction techniques to improve the efficiency of matrix multiplication.
  • Memory access pattern optimization: Optimize memory access to reduce memory bandwidth bottlenecks and improve overall performance.
  • FP16 quantization: Quantize floating-point numbers from FP32 to FP16 to reduce memory usage and further enhance inference speed.

Implementation steps:

  • Initial implementation: Use a simple thread allocation method that achieves basic functionality but has low performance, even worse than the CPU backend.
  • Block-level parallelism: Each block processes one row, using warp reduction to improve computational efficiency and raising performance to 51.7 tok/s (a rough sketch of this block-per-row pattern follows the list).
  • Kernel fusion: Fuse multiple kernels together to reduce memory access times, further improving performance to 54.1 tok/s.
  • Memory access optimization: Optimize memory access patterns through shared memory and block-level transfer techniques, raising performance to 56.1 tok/s.
  • Long context optimization: Optimize for long context generation, especially in the attention mechanism, further enhancing performance.
  • FP16 quantization: Quantize KV cache from FP32 to FP16, achieving a final short context generation speed of 63.8 tok/s and a long context generation speed of 58.8 tok/s through manual prefetching and loop unrolling techniques.
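
To illustrate the block-per-row idea in Python, here is a rough Numba CUDA sketch of a matrix-vector kernel where one block computes one output row. It uses a shared-memory tree reduction instead of the warp-shuffle reduction and kernel fusion described in the post, so treat it as a sketch of the parallelization pattern rather than the author's actual kernels.

```python
import numpy as np
from numba import cuda, float32

THREADS = 128  # threads per block; each block computes one output row

@cuda.jit
def matvec_block_per_row(A, x, y):
    row = cuda.blockIdx.x
    tid = cuda.threadIdx.x
    # each thread strides across the row, accumulating a partial dot product
    partial = float32(0.0)
    for j in range(tid, A.shape[1], cuda.blockDim.x):
        partial += A[row, j] * x[j]
    buf = cuda.shared.array(THREADS, dtype=float32)
    buf[tid] = partial
    cuda.syncthreads()
    # shared-memory tree reduction (the post uses warp shuffles instead)
    stride = THREADS // 2
    while stride > 0:
        if tid < stride:
            buf[tid] += buf[tid + stride]
        cuda.syncthreads()
        stride //= 2
    if tid == 0:
        y[row] = buf[0]

# launch: one block per output row
A = np.random.rand(4096, 4096).astype(np.float32)
x = np.random.rand(4096).astype(np.float32)
y = np.zeros(4096, dtype=np.float32)
matvec_block_per_row[A.shape[0], THREADS](A, x, y)
```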

Selected comments:

  • Netizen reasonableklout: Thank you very much for sharing; this article has been very helpful to me! I hope to see more similar deep technical articles in the future.
  • Netizen sakex: Great article! Next, I suggest exploring collective matrix multiplication and sharding techniques.
  • Netizen fancyfredbot: This article did not mention how to leverage tensor cores and wgmma instructions, which are crucial for achieving peak performance.
  • Netizen diego898: Is there similar content that can be implemented in Python? I hope to find a conceptually complete and concise tutorial.
  • Netizen saagarjha: Is __shfl_down no longer recommended for use because it may cause warp synchronization issues?
  • Netizen guerrilla: What prerequisites are needed for such projects? I studied ANN and backpropagation in college, but I am not familiar with many architectural terms.

Editor’s note: This article provides an accessible explanation of how to build a high-performance LLM inference engine from scratch, ideal for developers and technical enthusiasts interested in GPU optimization. With practical code examples and detailed performance analyses, readers can easily understand and apply these techniques. We hope to see more similar technical sharing in the future!

Cyphernetes: A Query Language for Kubernetes

https://cyphernet.es/

Cyphernetes is a query language designed specifically for Kubernetes, aiming to simplify the querying and manipulation of Kubernetes resources. As a complex container orchestration system, Kubernetes has a vast resource model and API, but existing query tools often perform poorly in large-scale clusters. Cyphernetes introduces a Cypher-like query language, similar to Neo4j, making it more intuitive and efficient to query Kubernetes resources.

Highlight features:

  • Graph query language: Cyphernetes adopts a Cypher-like syntax, the query language popularized by graph databases such as Neo4j. This lets users easily query and manipulate Kubernetes resources, especially those with complex dependencies.
  • High performance: Compared to traditional query tools (like k9s), Cyphernetes performs excellently in large-scale clusters without placing excessive burdens on the Kubernetes API server.
  • Visualization: Cyphernetes also provides functionality to visualize the Kubernetes object model, clarifying resource relationships.

Community feedback:

  • Netizen alpb: In years of production environments, I have never seen any tool querying kube-apiserver that can work stably in clusters with thousands of nodes. Even popular tools like k9s crash due to expensive queries (such as listing all pods). If you really need these querying capabilities, I suggest building your own data source (for example, using controllers to listen to objects and store data in an SQL database), which would be better in the long run.
  • Netizen danpalmer: While I don’t mind replacing jq/jsonpath with better tools, I don’t understand why this isn’t SQL? Cyphernetes supports nearly the same semantics as SQL, and SQL is already a mature query language. The project’s goal is to make Kubernetes easier to query, not to invent a new query language.
  • Netizen jeremya: I have always loved the Cypher language created by the Neo4j team for querying graph data. The relationships between Kubernetes API objects are very suitable for applying this type of query.
  • Netizen multani: I recommend using Steampipe for similar queries. Steampipe is essentially PostgreSQL that can query various APIs, including Kubernetes. It supports multi-cluster transparent querying, custom resource definitions, and caching mechanisms, making it very powerful and flexible.

Editor’s note: The launch of Cyphernetes undoubtedly brings new possibilities to the Kubernetes ecosystem. By using graph query language, users can more intuitively understand and operate complex Kubernetes resources. Although some community members have raised doubts about its performance and necessity, it is undeniable that Cyphernetes can provide more efficient solutions in specific scenarios. If you are looking for a new way to manage and query Kubernetes resources, give Cyphernetes a try and see if it meets your needs.
