Build a Natural Language Processing Transformer from Scratch

  • #1
jonjacson
TL;DR Summary
I wonder if anybody knows how to build and train one from scratch or if there is any book, video, or website explaining it.
I have read that transformers are the key behind the recent success in artificial intelligence, but the problem is that they are quite opaque.

I wonder if anybody knows how to build and train one from scratch or if there is any book, video, or website explaining it.

Thanks
 
  • #4
jonjacson said:
I have read that transformers are the key behind the recent success in artificial intelligence, but the problem is that they are quite opaque.
Then you need to understand the theory.

jonjacson said:
But I don't see a python implementation, just the theory.
You did not ask for python code.
Google: python code for NLP transformer

There will be more answers from others.
 
  • #5
jonjacson said:
But I don't see a python implementation, just the theory.
But you didn't ask for a Python implementation, you asked about building one from scratch!

If I wanted to find a Python machine learning algorithm related to [X] I would input "Tensorflow X" into a search engine. Have you tried this?
 
  • #6
Baluncore said:
Then you need to understand the theory. You did not ask for python code.
Google: python code for NLP transformer

There will be more answers from others.
I see answers, but they use libraries like PyTorch or TensorFlow. I mean from scratch, in pure Python.

pbuk said:
But you didn't ask for a Python implementation, you asked about building one from scratch!

If I wanted to find a Python machine learning algorithm related to [X] I would input "Tensorflow X" into a search engine. Have you tried this?
I don't want to use libraries.
 
  • #7
jonjacson said:
I see answers, but they use libraries like PyTorch or TensorFlow. I mean from scratch, in pure Python.
Even if you don't use libraries, looking at the source code for the libraries might be a good way of learning how these things are done in Python.

If searching the web doesn't turn up any Python implementations that don't use libraries, that's probably a clue that everyone else who has tried what you are trying has found it easier to use the well-tested implementations in the libraries than to try and roll their own.
 
  • #8
PeterDonis said:
Even if you don't use libraries, looking at the source code for the libraries might be a good way of learning how these things are done in Python.

If searching the web doesn't turn up any Python implementations that don't use libraries, that's probably a clue that everyone else who has tried what you are trying has found it easier to use the well-tested implementations in the libraries than to try and roll their own.

The problem is that this looks like a magic thing. I don't know why it is "hidden" behind the bogus language: "deep learning", "encoder", "decoder", "tokenized input embedding", "multi-head self-attention", "layer normalization", "feed-forward network", "residual connection" ... and all that stuff.

In the end, I guess this will be a whole bunch of vectors, matrices, and operations on them.

Hopefully now you understand what I want to know.
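[Editor's note: the guess above is right. As an illustrative sketch, not code from any of the resources linked in this thread, here is single-head scaled dot-product attention written in pure Python with only the standard library. The toy Q, K, V values are made up for demonstration.]

```python
# Minimal sketch of single-head scaled dot-product attention,
# pure Python, standard library only.
import math

def matmul(a, b):
    """Multiply two matrices given as lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def softmax(row):
    """Numerically stable softmax over one row."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    s = sum(exps)
    return [e / s for e in exps]

def attention(q, k, v):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]          # transpose K
    scores = matmul(q, k_t)                       # Q K^T
    scaled = [[s / math.sqrt(d) for s in row] for row in scores]
    weights = [softmax(row) for row in scaled]    # attention weights
    return matmul(weights, v)                     # weighted sum of V

# Two tokens, embedding dimension 2; self-attention means Q = K = V.
q = [[1.0, 0.0], [0.0, 1.0]]
out = attention(q, q, q)
```

Each output row is a convex combination of the value vectors, weighted by how strongly that token's query matches every key; "multi-head" attention just runs several of these in parallel on projected copies of the input and concatenates the results.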
 
  • #9
jonjacson said:
The problem is that this looks like a magic thing
That problem doesn't look to me like a "find Python code" problem. It looks to me like a "learn and understand the theory" problem, as @Baluncore has already pointed out.
 
  • #10
jonjacson said:
The problem is that this looks like a magic thing, ...
“Any sufficiently advanced technology is indistinguishable from magic”.
Arthur C. Clarke's third law.
 
  • #11
Baluncore said:
“Any sufficiently advanced technology is indistinguishable from magic”.
Arthur C. Clarke's third law.

Nice, but there is still no basic example of this anywhere.
 
  • #12
jonjacson said:
Nice, but there is still no basic example of this anywhere.
It is only magic because you do not yet understand the theory. If you were given some version of the Python code, you would still not understand the theory. It would still be magic, and a danger to the uninitiated.
 
  • #13
jonjacson said:
The problem is that this looks like a magic thing. I don't know why it is "hidden" behind the bogus language: "deep learning", "encoder", "decoder", "tokenized input embedding", "multi-head self-attention", "layer normalization", "feed-forward network", "residual connection" ... and all that stuff.
For the same reason that quantum mechanics is hidden behind the bogus language "complex projective space", "Hermitian operators", "Hamiltonians", "eigenstates", "superpositions" and all that stuff.

In the end, this is just a whole bunch of vectors, matrices, and operations on them.

jonjacson said:
Hopefully now you understand what I want to know.
Yes, you want to do QM without learning the theory. Good luck.

Edit: or is this the kind of thing you are looking for: https://habr.com/en/companies/ods/articles/708672/
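[Editor's note: to illustrate the point that it is all vectors and matrices, here is a hedged pure-Python sketch of two more pieces of the jargon from post #8: a position-wise feed-forward network wrapped in a residual connection and layer normalization. The weight matrices are toy values chosen for demonstration, not trained parameters.]

```python
# Sketch of a transformer sub-layer: feed-forward network + residual
# connection + layer normalization, standard library only.
import math

def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def feed_forward(x, w1, w2):
    """Two linear maps with a ReLU in between: max(0, x W1) W2."""
    hidden = [max(0.0, sum(xi * w for xi, w in zip(x, col)))
              for col in zip(*w1)]
    return [sum(hi * w for hi, w in zip(hidden, col)) for col in zip(*w2)]

def sublayer(x, w1, w2):
    """Residual connection (x + FFN(x)) followed by layer norm."""
    ff = feed_forward(x, w1, w2)
    return layer_norm([a + b for a, b in zip(x, ff)])

x = [1.0, 2.0]
w1 = [[1.0, 0.0], [0.0, 1.0]]   # toy 2x2 weights for illustration
w2 = [[0.5, 0.0], [0.0, 0.5]]
y = sublayer(x, w1, w2)
```

The residual connection means the layer only has to learn a correction to its input, and the normalization keeps activations in a stable range as layers stack.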
 
  • #14
pbuk said:
For the same reason that quantum mechanics is hidden behind the bogus language "complex projective space", "Hermitian operators", "Hamiltonians", "eigenstates", "superpositions" and all that stuff.

In the end, this is just a whole bunch of vectors, matrices, and operations on them.

Yes, you want to do QM without learning the theory. Good luck.

Edit: or is this the kind of thing you are looking for: https://habr.com/en/companies/ods/articles/708672/

I am not saying that theory is bad or unnecessary. What I am looking for is a numerical example.

The Schrödinger equation is fine, but once you compute the orbitals of the hydrogen atom, you get a better understanding.

I don't understand why it is bad to ask for numerical examples and numbers.

Your edit was great, and it is what I was looking for. I'll add the link from the end of that article:

https://jalammar.github.io/illustrated-transformer/

And something I just found:

https://e2eml.school/transformers.html

I hope this helps anybody interested in this topic.

Thanks to all for your replies.

Edit:

This may be good too:

 
Last edited:

FAQ: Build a Natural Language Processing Transformer from Scratch

How difficult is it to build a Natural Language Processing Transformer from Scratch?

Building a Natural Language Processing Transformer from scratch can be quite challenging, as it requires a deep understanding of neural networks, attention mechanisms, and natural language processing concepts. It also involves working with large datasets and complex algorithms.

What are the key components of a Natural Language Processing Transformer?

The key components of a Natural Language Processing Transformer include the encoder-decoder architecture, self-attention mechanism, feedforward neural networks, and positional encoding. These components work together to process and generate language sequences.
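Of the components above, positional encoding is the easiest to write down concretely. As a minimal sketch, here is the sinusoidal scheme from the original transformer formulation, in plain Python: even dimensions get a sine, odd dimensions a cosine, with wavelengths that grow geometrically across the embedding.

```python
# Sinusoidal positional encoding:
#   PE(pos, 2i)   = sin(pos / 10000^(2i / d_model))
#   PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))
import math

def positional_encoding(seq_len, d_model):
    """Return a seq_len x d_model matrix of position encodings."""
    pe = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            # Dimensions come in sin/cos pairs sharing one frequency.
            angle = pos / (10000 ** ((i // 2 * 2) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        pe.append(row)
    return pe

pe = positional_encoding(4, 8)  # 4 positions, embedding dimension 8
```

These vectors are simply added to the token embeddings, giving the attention mechanism (which is otherwise order-blind) a way to tell positions apart.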

Do I need a strong background in machine learning to build a Natural Language Processing Transformer from Scratch?

Yes, a strong background in machine learning is essential to successfully build a Natural Language Processing Transformer from scratch. You should have a good understanding of neural networks, deep learning, and natural language processing techniques to tackle this complex task.

How long does it take to build a Natural Language Processing Transformer from Scratch?

The time it takes to build a Natural Language Processing Transformer from scratch can vary depending on your level of expertise, the complexity of the model, and the size of the dataset you are working with. It could take several weeks to several months to complete the project.

What are some resources or tutorials available for building a Natural Language Processing Transformer from Scratch?

There are several resources and tutorials available online that can help you build a Natural Language Processing Transformer from scratch. Some popular resources include research papers, online courses, blog posts, and open-source libraries that provide code implementations and explanations of the transformer architecture.
