Floating-Point Arithmetic for AI Inference - Hit or Miss?

NORTHAMPTON, MA / ACCESSWIRE / April 11, 2023 / Qualcomm:

    OnQ Blog

    Artificial intelligence (AI) has become pervasive in our lives, improving our phones, cars, homes, medical centers, and more. As currently structured, these models primarily run in power-hungry, network-dependent data centers. Running AI on edge devices such as smartphones and PCs would improve reliability, latency, privacy, network bandwidth usage, and overall cost.

    To move AI workloads to devices, we need to make neural networks considerably more efficient. Qualcomm has been investing heavily in the tools to do so, most recently showcasing the world's first Stable Diffusion model on an Android phone. Bringing models like GPT, with its hundreds of billions of parameters, to devices will require even more work.

The Qualcomm AI Research team has been making advances in deep learning model efficiency in recent years, with state-of-the-art results in neural architecture search, compilation, conditional compute, and quantization. Quantization, which reduces the number of bits needed to represent information, is particularly important because it enables the largest effective reduction in the size of weights and activations, improving power efficiency and performance while maintaining accuracy. It also helps enable use cases that run multiple AI models concurrently, which is relevant for industries such as mobile, XR, automotive, and more.
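As an illustration of the idea, quantization in its simplest form (symmetric, per-tensor) maps real-valued weights onto integers through a single scale factor. The following NumPy sketch is ours, not taken from Qualcomm's released code:

```python
import numpy as np

def quantize_int8(x, scale):
    """Symmetric uniform quantization: each real value is approximated
    by scale * q, with q an 8-bit signed integer."""
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

def dequantize_int8(q, scale):
    """Map the integers back to the nearest representable real values."""
    return scale * q.astype(np.float32)

# Choose the scale so the largest-magnitude weight lands on the INT8 range.
w = np.array([0.05, -0.31, 0.27, -0.12], dtype=np.float32)
scale = np.abs(w).max() / 127.0
q = quantize_int8(w, scale)
w_hat = dequantize_int8(q, scale)
```

Each stored value now occupies 8 bits instead of 32, and the reconstruction error is bounded by half the scale, which is the trade-off between footprint and accuracy that the text describes.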

Recently, a new 8-bit floating-point format (FP8) has been proposed for efficient deep-learning training. Because some layers of a neural network can be trained in FP8 rather than the incumbent FP16 and FP32 formats, FP8 promises a large efficiency improvement for training. For inference, however, integer formats such as INT4 and INT8 have traditionally been used, as they offer an excellent trade-off between network accuracy and efficiency.
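To make the format concrete: one commonly proposed FP8 variant is E4M3 (1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits). The decoder below is an illustrative sketch of that layout, not of any particular hardware implementation; reserved special patterns such as NaN are ignored for brevity:

```python
def fp8_e4m3_value(byte):
    """Decode an 8-bit E4M3 pattern into its real value.
    Layout: 1 sign bit | 4 exponent bits (bias 7) | 3 mantissa bits.
    Note: the E4M3 proposal reserves some patterns (e.g. NaN); omitted here."""
    s = (byte >> 7) & 0x1
    e = (byte >> 3) & 0xF
    m = byte & 0x7
    sign = -1.0 if s else 1.0
    if e == 0:
        # Subnormal: no implicit leading 1, fixed exponent of -6.
        return sign * (m / 8.0) * 2.0 ** (-6)
    return sign * (1.0 + m / 8.0) * 2.0 ** (e - 7)
```

With only 3 mantissa bits, values are sparse (8 steps per binade), but the exponent lets the format span a much wider dynamic range than INT8's fixed grid, which is exactly what makes the comparison for inference interesting.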

We investigate the differences between the FP8 and INT8 formats for efficient inference and conclude that the integer format is superior from a cost and performance perspective. For transparency, we have also open-sourced the code for our investigation.

    Differences between floating point and integer quantization

Our whitepaper compares the efficiency of floating-point and integer quantization. For training, the floating-point formats FP16 and FP32 are commonly used because they offer high enough accuracy and require no hyper-parameters; they mostly work out of the box, making them easy to use.
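The contrast can be seen in a small experiment: casting to FP16 involves no choices at all, while INT8 accuracy hinges on a scale hyper-parameter. The max-calibration heuristic below is one common choice for illustration, not the whitepaper's method:

```python
import numpy as np

# Random weights standing in for a network layer (illustrative only).
w = np.random.default_rng(0).normal(size=1000).astype(np.float32)

# FP16: no hyper-parameters -- simply cast and measure the rounding error.
err_fp16 = np.abs(w - w.astype(np.float16).astype(np.float32)).max()

# INT8: accuracy depends on choosing a quantization scale (a hyper-parameter).
scale = np.abs(w).max() / 127.0  # one common max-calibration heuristic
q = np.clip(np.round(w / scale), -128, 127).astype(np.int8)
err_int8 = np.abs(w - scale * q.astype(np.float32)).max()
```

The float format absorbs the full range without tuning, whereas the integer format trades a coarser, calibration-dependent grid for cheaper arithmetic; that tuning burden is why quantization needs more care than plain FP16/FP32 training.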



Written by Accesswire