ein0p 2 days ago

It is a very bad idea to handle the KV cache in JAX naively like that. JAX requires static shapes under jit. You're creating dynamic shapes there, causing a ton of recompilation.
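
To make the point concrete, here is a minimal sketch (not the blog's code) comparing a cache that grows by concatenation with one preallocated to a fixed maximum length and updated in place. The sizes and the names `append_naive`, `append_static`, and `trace_count` are made up for illustration. The counter is bumped as a Python side effect, which only runs while JAX is tracing, so it counts how often each function gets retraced.

```python
import jax
import jax.numpy as jnp
from jax import lax

# Hypothetical sizes, chosen only for this illustration.
MAX_LEN, HEADS, HEAD_DIM = 16, 2, 4

# Python side effects inside a jitted function run only during tracing,
# so these counters record how many times each function is (re)traced.
trace_count = {"static": 0, "naive": 0}


@jax.jit
def append_naive(cache, new_kv):
    trace_count["naive"] += 1
    # The output is one row longer than the input, so the next call sees
    # a new input shape and has to be traced and compiled again.
    return jnp.concatenate([cache, new_kv], axis=0)


@jax.jit
def append_static(cache, new_kv, pos):
    trace_count["static"] += 1
    # The cache keeps shape (MAX_LEN, HEADS, HEAD_DIM); only the row at
    # `pos` is overwritten. Attention over this cache would also need a
    # mask such as `jnp.arange(MAX_LEN) <= pos` to ignore unused rows.
    return lax.dynamic_update_slice(cache, new_kv, (pos, 0, 0))


def run(steps):
    naive_cache = jnp.zeros((0, HEADS, HEAD_DIM))
    static_cache = jnp.zeros((MAX_LEN, HEADS, HEAD_DIM))
    for pos in range(steps):
        new_kv = jnp.full((1, HEADS, HEAD_DIM), float(pos + 1))
        naive_cache = append_naive(naive_cache, new_kv)
        static_cache = append_static(static_cache, new_kv, pos)
    return static_cache, naive_cache


static_cache, naive_cache = run(5)
print(trace_count)
```

The naive version is retraced on every step because its input shape changes each time, while the preallocated version is traced once. The trade-off is that you have to pick a maximum length up front and mask out the unused positions.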

  • magicalhippo 2 days ago

    The blog mentions it's not for production use. This sounds like one thing you'd want to change.

    I was curious what else made it not fit for production. Anything fundamental or just minor issues like this?

  • bravura 2 days ago

    Is there any automatic way to get warned against these antipatterns?

    • sega_sai 2 days ago

      You can see each compilation if you set the JAX_LOG_COMPILES environment variable or use a low enough logging level.

      • bravura 2 days ago

        Sorry, not to belabor this point.

        Would that suggest to you what you did wrong? Or purely show you what you got right? How chatty is this variable?

        • sega_sai 2 days ago

          I used this to check whether something is compiled repeatedly. I.e. I have code that runs in a loop, and you immediately see whether something is compiled only once or on every iteration (and it produces a lot of output). I'm not saying this is the best way to do it though, it just worked for me.
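
          A rough sketch of that workflow, assuming the `jax_log_compiles` config option (the in-code equivalent of the environment variable) and a handler attached to the "jax" logger. The `Collect` handler, `f`, and `compile_logs` are hypothetical names, and the exact log messages differ between JAX versions, so this only compares how many messages each loop produces:

```python
import logging

import jax
import numpy as np

# Same effect as setting JAX_LOG_COMPILES=1 in the environment:
# compilation messages are logged at WARNING level instead of DEBUG.
jax.config.update("jax_log_compiles", True)
jax.devices()  # initialize the backend before we start collecting logs


class Collect(logging.Handler):
    """Hypothetical helper: stores every log message it receives."""

    def __init__(self):
        super().__init__(level=logging.WARNING)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


@jax.jit
def f(x):
    return (x * 2.0).sum()


def compile_logs(sizes):
    """Call f on inputs of the given lengths and return the JAX log lines."""
    handler = Collect()
    logger = logging.getLogger("jax")
    logger.addHandler(handler)
    try:
        for n in sizes:
            # NumPy inputs avoid extra compilations from building the array.
            f(np.ones(n, dtype=np.float32)).block_until_ready()
    finally:
        logger.removeHandler(handler)
    return handler.messages


stable_logs = compile_logs([8, 8, 8, 8])   # same shape every iteration
varying_logs = compile_logs([1, 2, 3, 4])  # new shape every iteration
print(len(stable_logs), len(varying_logs))
```

          The loop with a fixed shape only logs on its first iteration, while the loop with changing shapes logs on every iteration, which is the pattern to look for.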

  • YetAnotherNick 2 days ago

    Just don't use jit in generation and it would be fine. Of course there is some performance penalty but in my experience jit is oversold and the difference is something like ~10-30%.

    Also in any case to get optimized code you need flash attention and many other tricks.

heyitsguay 2 days ago

Unreadable in portrait mode on mobile. The text column is way too narrow, should be an easy fix!

  • kccqzy 2 days ago

    People had long forgotten that mobile browsers handle wide content by zooming. If you are making a website but don't bother optimizing it for mobile, leave off the viewport <meta> element.

  • abhgh 2 days ago

    It's not just the width of the column - there are annotations on certain lines (that appear on a right "margin") that don't show up on mobile. I think that makes it not an easy fix, but to your larger point, this is not very mobile friendly. It looks quite good on a desktop though.

ge96 2 days ago

"focuses on the soul of pure functional programming which makes it more cool"

This is tangential to this post's main point but if you're trying for mass adoption this can go badly. Case in point, a hardware company I backed decided to write their code in Haskell (why? "because it's cool"), and now the people who are trying to modify/work with it have to deal with Haskell vs. a general purpose language like C++, idk...

edit: I also realize most of this code is python but yeah

  • drdaeman 2 days ago

    > deal with Haskell vs. a general purpose language like C++

    What's the actual problem? Company decided to use Haskell (which is also a general-purpose language) then hired people who don't know it?

    If so, hire a bunch of Pythonistas to work on a Rails project and you'll have similar kind of struggles (and it won't mean that Python or Ruby are somehow bad, it'll be an almost entirely non-technical issue).

    • ge96 2 days ago

      the problem is it's intended to be an open source device, so Haskell would be harder to work on than something simpler like C++

      again my point is about adoption, hence offering multiple languages in most products like stripe for ex

      edit: it's alright, when they actually ship these things (after putting down $3.5K) I hope to take it upon myself to port it to C++

      edit: "general purpose" is probably the wrong way to put it, Haskell is harder to read than C++ is my pov

      • Hasnep a day ago

        If you know Haskell and don't know C++ then C++ will be harder to read. Haskell is definitely less widely used than C++, but that doesn't make it more complex.

        • ge96 a day ago

          Idk, they're different eg. imperative vs. declarative and that monad thing.

          Still... I'm working with someone who came from a Swift background and thinks JavaScript is hard so that goes against my thought.

brcmthrowaway 2 days ago

these anime kids are going to take everyone's job

  • ge96 2 days ago

    Anya from Spy x Family

  • rfl890 2 days ago

    said no one ever

bravura 2 days ago

To the poster who wrote: "Hey Saurabh, will you be willing to teach me this on a call? I'm willing to pay for it (im not rich, so, dont expect much please). I will be having a lot of questions, mostly related to core concepts of transformers and jax in general."

This is the wrong way to ask for help.

Instead, consider offering your help and time apprenticing and learning along the way. Can't code that well? Write test cases and clean up. Or help blog writing. etc. You certainly have some valuable skill you could trade up.

  • saagarjha 2 days ago

    I mean I’m no Saurabh but that didn’t seem too unreasonable to me? In fact I’ll put my money where my mouth is and offer half an hour for free just to spite you