Aug. 6, 2025

213 - Setting up your own chatbot with Ruggiero Lovreglio and Amir Rafe


The AI revolution has arrived, but fire safety engineers face a critical dilemma: how to leverage powerful AI tools while protecting confidential project data. 

Professor Ruggiero Rino Lovreglio from Massey University and Dr. Amir Rafe from Utah State University join us to explore the world of local Large Language Models (LLMs) - AI systems you can run privately on your own computer without sending sensitive information to the cloud. While cloud-based AI like ChatGPT raises serious privacy concerns (as Sam Altman recently admitted, user prompts could be surrendered to courts if requested), local models offer a secure alternative that doesn't compromise confidentiality.

We break down things you should know about setting up your own AI assistant: from hardware requirements and model selection to fine-tuning for fire engineering tasks. Our guests explain how even models with "just" a few billion parameters can transform your workflow while keeping your data completely private. They share their groundbreaking work developing specialized fire engineering datasets and testing these tools on real-world evacuation problems.

The conversation demystifies technical concepts like parameters, temperature settings, RAG (Retrieval-Augmented Generation), and fine-tuning - making them accessible to engineers without computer science backgrounds. Most importantly, we address why fire engineering remains resilient to AI takeover (with only a 19% risk of automation) while exploring how these tools can enhance rather than replace human expertise.

Whether you're AI-curious or AI-skeptical, this episode provides practical insights for integrating these powerful tools into your engineering practice without compromising the confidentiality that defines professional work. Download Ollama today and take your first steps toward a more efficient, AI-augmented engineering workflow that keeps your data where it belongs - on your computer.

Further reading: https://ascelibrary.org/doi/abs/10.1061/9780784486191.034

Ollama: https://ollama.com/

Hugging Face: https://huggingface.co/

Rino's YouTube channel with guide videos: https://www.youtube.com/@rinoandcaroline
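If you want to try the local-LLM workflow discussed in the episode, it can be sketched in a few lines of Python. This is a minimal example, assuming Ollama is installed and serving on its default local port 11434, and that a model has been pulled first (the model name `llama3.2` here is an arbitrary example):

```python
import json
import urllib.request

# Ollama's default local endpoint; nothing is sent to the cloud.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str, temperature: float = 0.0) -> dict:
    """Assemble the JSON body for Ollama's /api/generate endpoint."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,                      # return one complete answer
        "options": {"temperature": temperature},
    }

def ask_local_model(model: str, prompt: str) -> str:
    """Send a prompt to the locally running model and return its answer."""
    body = json.dumps(build_request(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example usage (requires `ollama pull llama3.2` to have been run first):
# print(ask_local_model("llama3.2", "Explain RAG in two sentences."))
```

Because the endpoint is local, this works with the network cable unplugged, which is exactly the confidentiality argument made in the episode.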

----
The Fire Science Show is produced by the Fire Science Media in collaboration with OFR Consultants. Thank you to the podcast sponsor for their continuous support towards our mission.

00:00 - The AI Revolution in Fire Engineering

03:22 - Privacy Concerns with Cloud-Based AI

09:20 - Understanding AI Models and Parameters

17:42 - Why Fire Engineering Resists AI Takeover

25:45 - Local LLMs: Running AI Privately

35:49 - Fine-Tuning Models for Specialized Tasks

48:47 - Future Integration of AI in Engineering

58:07 - Final Thoughts and Episode Wrap-up

WEBVTT

00:00:00.541 --> 00:00:02.548
Hello everybody, welcome to the Fire Science Show.

00:00:02.548 --> 00:00:09.173
When the ChatGPT revolution occurred, I was very happy to tell you all about it as soon as I could.

00:00:09.173 --> 00:00:22.533
I had also an episode with Mike Kinsey where we've discussed the possibility of using tools like ChatGPT and setting up your own stuff that can support engineers' workflow.

00:00:22.533 --> 00:00:24.663
Fast forward a few years later.

00:00:24.663 --> 00:00:27.507
I think we got used to this technology by now.

00:00:27.507 --> 00:00:30.553
I think it's going to be the defining technology of this decade.

00:00:30.553 --> 00:00:42.072
You know, like the Internet defined the 90s, I guess Facebook defined the 2000s, Instagram and Twitter probably defined the 2010s.

00:00:42.072 --> 00:00:43.323
Maybe TikTok.

00:00:43.625 --> 00:00:50.567
I'm a different generation and I think this decade will be defined through large language models, chatbots, etc.

00:00:50.567 --> 00:00:54.293
And it's just a part of our lives nowadays.

00:00:54.293 --> 00:00:59.073
But are you really using that in your engineering workflow?

00:00:59.073 --> 00:01:00.639
I'm using it for programming.

00:01:00.639 --> 00:01:04.013
I'm using it to solve pieces of the problems that I work on.

00:01:04.013 --> 00:01:22.121
I find solutions to some issues much quicker with the support of chatbots, but it's not that I'm really incorporating them in my workflows completely, and the real problem with them is the privacy problem.

00:01:22.121 --> 00:01:28.313
And hallucinations, yes, but privacy would be the one that worries me the most.

00:01:28.799 --> 00:01:44.531
A day or two days ago, I saw a quote from Sam Altman when he was asked what's going to happen if a court asks OpenAI to release prompts of a user in some sort of court hearing, and Sam said that they probably will have to give that to the court.

00:01:44.531 --> 00:01:53.840
So if you are talking with ChatGPT, any kind of LLM it's not that you're having a secure conversation with your computer.

00:01:53.840 --> 00:02:00.281
You're sending all of that into the internet and if you upload a file, it goes somewhere.

00:02:00.281 --> 00:02:04.569
If you upload a confidential file, well, it goes somewhere as well.

00:02:04.569 --> 00:02:12.653
So you probably don't want to do that, and that kind of limits the ability for us to work, because most of the stuff we have here is confidential.

00:02:12.653 --> 00:02:16.751
The amount of NDAs I have to sign to do anything is crazy.

00:02:16.751 --> 00:02:22.473
Therefore, the ability to use AI in my engineering workflow is limited.

00:02:22.800 --> 00:02:34.271
And here comes the solution, my two guests today: Professor Ruggiero Lovreglio from Massey University and Dr Amir Rafe from Utah State University.

00:02:34.271 --> 00:02:47.911
They've been playing with this technology, but they were playing with LLMs, or small language models, that you can install locally on your computer, so you keep ownership of the data that is being sent.

00:02:47.911 --> 00:02:50.271
You don't even need internet for them to work.

00:02:50.271 --> 00:02:51.680
Quite magical world.

00:02:51.680 --> 00:03:00.889
Instead of relying on the insane computational power of OpenAI or xAI, you can use your own computer to be your own chatbot.

00:03:00.889 --> 00:03:02.944
It comes with requirements.

00:03:02.944 --> 00:03:03.829
Not that easy.

00:03:03.829 --> 00:03:09.372
Well, it technically is easy, but it has its challenges, which you will learn in the episode.

00:03:09.372 --> 00:03:15.713
So I think this opens a new pathway where those tools can be really, really useful for fire safety engineering.

00:03:16.116 --> 00:03:23.469
Enough of my rambling, because there's a lot of valuable content behind the intro, so let's spin the intro and jump into the episode.

00:03:23.469 --> 00:03:29.719
Welcome to the Fire Science Show.

00:03:29.719 --> 00:03:33.203
My name is Wojciech Węgrzyński and I will be your host.

00:03:49.193 --> 00:04:02.659
The Fire Science Show is into its third year of continued support from its sponsor, OFR Consultants, who are an independent, multi-award-winning fire engineering consultancy with a reputation for delivering innovative safety-driven solutions.

00:04:02.659 --> 00:04:16.420
As the UK's leading independent fire risk consultancy, OFR's globally established team have developed a reputation for preeminent fire engineering expertise, with colleagues working across the world to help protect people, property and the planet.

00:04:16.420 --> 00:04:32.548
Established in the UK in 2016 as a startup business by two highly experienced fire engineering consultants, the business continues to grow at a phenomenal rate, with offices across the country in eight locations, from Edinburgh to Bath, and plans for future expansions.

00:04:32.548 --> 00:04:40.752
If you're keen to find out more or join OFR Consultants during this exciting period of growth, visit their website at ofrconsultants.com.

00:04:40.752 --> 00:04:43.201
And now back to the episode.

00:04:43.201 --> 00:04:47.425
Hello everybody, welcome to the Fire Science Show around the globe.

00:04:47.425 --> 00:04:56.175
Today I'm in my studio in Warsaw. My first guest is Dr Amir Rafe from Utah State University.

00:04:56.175 --> 00:04:58.016
Hey, Amir, nice to see you.

00:04:58.360 --> 00:05:11.687
Hello, thank you for welcoming me. Thank you. Good afternoon, I guess. And my second guest, Professor Ruggiero Lovreglio from Massey University.

00:05:11.687 --> 00:05:14.970
Hey, Rino, good to see you. Good morning everyone.

00:05:14.970 --> 00:05:17.086
Good morning in New Zealand.

00:05:17.105 --> 00:05:24.750
Wow, that's literally around the globe, nice we are in the future here and can tell you that the weather looks good.

00:05:31.540 --> 00:05:32.521
I'm so glad that tomorrow looks nice.

00:05:32.521 --> 00:05:33.785
Thank you, that's what I was looking for.

00:05:33.785 --> 00:05:38.182
It's a very late evening in Warsaw. Amir, congratulations on passing your viva, I heard that it was a few days ago.

00:05:38.182 --> 00:05:43.211
So all for a good start into the episode, and it's very interesting content.

00:05:43.211 --> 00:05:48.846
We are talking about AI and how AI will change the industry.

00:05:48.846 --> 00:06:02.302
I remember, I think two years ago, I was talking with Mike Kinsey on the podcast about creating some sort of AI tools or AI-like tools, because Mike had some explicit tools that were not really AI.

00:06:02.302 --> 00:06:07.754
AI also, to be honest, felt like, you know, a dream for the future.

00:06:07.754 --> 00:06:09.146
It was very interesting.

00:06:09.146 --> 00:06:13.211
Today, two years later, holy crap, a lot has changed.

00:06:13.211 --> 00:06:18.771
Rino, can you summarize where we are in this madness of AI revolution today?

00:06:19.300 --> 00:06:45.314
So yeah, we can say that three years ago we all had the shake-up when we tried the GPT thing, I think it was back then, and we started typing and we started seeing, oh gosh, it's answering questions, it's looking like a human, it's doing stuff that we're not expecting, and that was the big shock that the whole world had with OpenAI and their first public tool for all of us.

00:06:45.314 --> 00:06:48.009
From there, things have been going wild.

00:06:48.009 --> 00:06:52.170
You can see that there is a lot more competition on cloud services.

00:06:52.170 --> 00:07:01.985
I experienced some of them myself, using Claude, Grok, Gemini, and they are really there trying to fight with each other.

00:07:01.985 --> 00:07:12.089
Who is going to have the best results with benchmarking, some of them cheating, because they train the model on the benchmark and then they say, oh look, we're getting a really good mark.

00:07:12.089 --> 00:07:13.865
It's like, of course.

00:07:13.865 --> 00:07:21.461
So there is a lot at stake, especially who is going to be the one still leading forward.

00:07:21.903 --> 00:07:28.067
You've been talking, I don't know, for a year about ChatGPT, and the new one is always about to come.

00:07:28.067 --> 00:07:35.572
God knows when it's going to come, but we could see already, from GPT-3 to 4, the great advancement.

00:07:35.572 --> 00:07:58.526
The latest news in the last couple of weeks was the release of an agent function within ChatGPT, which was like wow for the world. Not much wow for me and Amir, because if you are in the field and you see all the open tools that are out there, you've been prototyping a lot of that stuff yourself before ChatGPT or whoever produced those tools.

00:07:58.526 --> 00:08:02.425
So it was like, yeah, nice, let's give it a try, let's see how it works.

00:08:02.425 --> 00:08:08.293
And so now everyone has the buzzword agentic AI, agentic AI, goodness.

00:08:08.579 --> 00:08:19.312
And if you see what ChatGPT was one year ago, it already had the possibility to be an agent, because it was loading a Python environment, writing the code for you, developing the charts.

00:08:19.312 --> 00:08:30.516
And as I tell my wife, it's not the language model that develops the charts in ChatGPT; it's because it has agency over a Python environment to do stuff and give the results back to you.

00:08:30.516 --> 00:08:32.988
So agentic is not new.

00:08:32.988 --> 00:08:40.014
It's a nice buzzword to do marketing, to sell fluff, but probably it's already two, three years old stuff.

00:08:40.014 --> 00:08:42.206
Amir probably can tell us more about it.

00:08:42.880 --> 00:08:44.368
Yeah, I'm very happy to hear that.

00:08:44.368 --> 00:08:51.529
If we could just quickly round up the popular models: you mentioned ChatGPT, Claude, Grok, Gemini.

00:08:51.529 --> 00:08:54.149
There's also Perplexity, if I'm not wrong.

00:08:54.480 --> 00:08:59.533
Yeah, and I'm not even mentioning Perplexity or Copilot, because they are not models.

00:08:59.533 --> 00:09:04.110
They are just AI tools that use as a backbone these big models.

00:09:04.110 --> 00:09:16.787
Those platforms are just capable of reusing something that is already there through an API and selling you something that is a bit more customized for a specific task, and that's the direction we are taking.

00:09:16.787 --> 00:09:20.855
Also for fire protection engineering. And the Chinese one, what was its name?

00:09:20.855 --> 00:09:26.211
Ah yeah, deepcq was like there was another shaker for the wars.

00:09:26.211 --> 00:09:39.972
It's because of the results, because it was pretty cool with the thinking capability, but also because they realized that they spent, at least that's the official data, much less money than everyone else to train a model.

00:09:39.972 --> 00:09:46.532
And they were like, oh my goodness, and in China there are a lot of bans and difficulties finding the advanced graphics cards.

00:09:47.000 --> 00:09:48.725
But underneath the hood.

00:09:48.725 --> 00:09:52.452
It's all instances of a very similar concept.

00:09:52.452 --> 00:09:54.426
It was called Llama, I believe.

00:09:54.426 --> 00:09:54.947
If I'm not wrong.

00:09:54.947 --> 00:10:01.273
But, Amir, tell us more about where we are from a technical point of view and how the environment looks right now.

00:10:01.633 --> 00:10:21.328
Sure. First I wanted to say we have had AI, I think, from the 1940s, and after that in 1950, when Alan Turing proposed and designed a test for whether machines work as a human, or think as a human.

00:10:21.328 --> 00:10:30.687
And I think now, in 2025, we are still working on that, because we are looking for AGI, artificial general intelligence.

00:10:30.687 --> 00:10:42.086
But in the product side we have a lot of AI models or, as we can say, we have a lot of large language models or small language models.

00:10:42.086 --> 00:10:50.192
They can do a lot of things in the space of fire engineering or transportation.

00:10:50.192 --> 00:10:56.493
If I wanted to add just one point to Rino's list: what about Perplexity?

00:10:56.493 --> 00:11:00.990
It has a new large language model, which they call Sonar.

00:11:00.990 --> 00:11:06.230
So they have a Sonar reasoning model as the thinking model.

00:11:06.552 --> 00:11:24.274
But I think OpenAI, when they published ChatGPT and the GPT models, after GPT-2.5 and after that 3.5, I think in 2023, if I'm correct.

00:11:24.274 --> 00:11:36.191
So they changed a lot of things, because you know we worked with them for the usual tasks, and you know ChatGPT was a generative AI.

00:11:36.191 --> 00:11:47.085
It's so important for us because we can communicate with these models and we can ask them things, and after that AI agents developed, and a lot of things.

00:11:47.085 --> 00:12:00.542
So after ChatGPT, the Meta company, as we know it for Facebook or WhatsApp or Instagram, published an open-source model, Llama.

00:12:00.542 --> 00:12:10.500
That changed a lot of things, because now we found we can work with open-source models, we can use the API for our tools.

00:12:10.500 --> 00:12:26.347
It's so important for us. Now, in 2025, we have a lot of open-source models, like Llama, and we can use a lot of APIs using OpenRouter or similar products like this.

00:12:27.139 --> 00:12:43.914
So I think the open-source models that we have on Hugging Face and, you know, Ollama, are so important for us as researchers or as engineers, to create products, to create a pipeline for our work.


00:12:47.179 --> 00:12:55.687
So I think it's a good start to talk about how we can use AI in our research or in our field, fire engineering.


00:12:56.481 --> 00:13:03.049
It's an ongoing discussion and it's a discussion in the practical aspect already, because everyone is using AI in one way or another.

00:13:03.049 --> 00:13:07.869
Even if you're doing Google search today, you are using some sort of AI already in it.

00:13:07.869 --> 00:13:36.902
One thing that was with us from the start of this GPT-fueled AI revolution, with the first release of the chatbot, was how deep they can go, how good answers they can make, and an immediate problem that we observed was that, with great confidence, they will give you a really bullshit answer, sometimes blatantly wrong, with 100% confidence that they're right.

00:13:36.902 --> 00:13:40.988
For me as a user of these tools, this is absolutely frustrating.

00:13:41.379 --> 00:13:45.250
Not sure if you've seen a meme of an AI surgeon.

00:13:45.250 --> 00:13:48.485
It's like: I've removed part of your body. Shouldn't it be on the other side?

00:13:48.485 --> 00:13:51.402
Oh, yes, you are right, it should have been on the other side.

00:13:51.402 --> 00:13:52.586
Then let me remove it again.

00:13:52.586 --> 00:13:54.556
That kind of summarized this experience to me.

00:13:54.556 --> 00:13:57.804
Has anything changed in this hallucination aspect?

00:13:57.804 --> 00:14:00.110
Have things improved, changed?

00:14:00.110 --> 00:14:02.943
I have a feeling they're like a sinusoidal curve.

00:14:02.943 --> 00:14:08.028
They get better and worse, better and worse, and I can't tell which part of the cycle we're in.

00:14:08.681 --> 00:14:18.572
Yeah, no, that hallucination has been one of the major things that people have been complaining about, especially when you are trying to get something with a reference.

00:14:18.572 --> 00:14:23.129
You get these beautifully titled papers and then you go there and you don't find them.

00:14:23.129 --> 00:14:24.230
Can we prevent it?

00:14:24.230 --> 00:14:25.153
Definitely yes.

00:14:25.153 --> 00:14:35.639
The problem is that really general models like ChatGPT have been trained to be good at answering a bit of everything, and not to actually report when the knowledge is not there.

00:14:35.639 --> 00:14:53.109
But I've been learning a lot from Amir that, using a system prompt or using other settings or parameters of the model, like the temperature, which is a dial on the model, you can reduce it and make the model more deterministic.

00:14:53.109 --> 00:14:56.710
It then sticks more to the knowledge it was trained on.

00:14:56.710 --> 00:15:14.551
So I always show an example that when I use even a small model, like 3 billion parameters. To give you a reference point, GPT-4 is said to be 1.7 trillion parameters, and GPT-3 was ten times smaller, about 170 billion.

00:15:14.551 --> 00:15:17.947
And now you can run models on your computer.

00:15:17.947 --> 00:15:23.710
Consider that one gigabyte of RAM will allow you roughly a bit more than one billion parameters.

00:15:23.710 --> 00:15:28.940
So we can run multi-billion-parameter language models on our own personal PC, locally.
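As a rough back-of-the-envelope check of this rule of thumb, you can estimate the memory a model's weights need from the parameter count and the precision they are stored at. The figures are approximations only; real usage adds overhead for the context window and runtime buffers:

```python
def model_ram_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate RAM (GB) needed to hold a model's weights.

    bytes_per_param: 4 for fp32, 2 for fp16, 1 for 8-bit,
    0.5 for 4-bit quantized weights.
    """
    # params_billions * 1e9 parameters * bytes each / 1e9 bytes per GB
    return params_billions * bytes_per_param

# A 7B model quantized to 4 bits needs roughly 3.5 GB (plus overhead):
print(model_ram_gb(7, 0.5))   # -> 3.5
# The same model at fp16 needs about 14 GB:
print(model_ram_gb(7, 2))     # -> 14.0
```

This is why quantized multi-billion-parameter models fit on an ordinary laptop, while trillion-parameter models need a cluster.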

00:15:28.940 --> 00:15:30.886
So you unplug the internet.

00:15:30.947 --> 00:15:35.029
It's still working, and you can start asking it things, like: who is Rino?

00:15:35.029 --> 00:15:37.769
Of course it's so small that it doesn't know me.

00:15:37.769 --> 00:15:51.888
It might know a lot about Isaac Newton, and if I don't make any change in the settings, it's most likely going to start telling me that I'm either a mafia boss or a musician from Naples. I don't know which one is worse, just kidding.

00:15:51.888 --> 00:15:55.289
And it's going to start fabricating a lot of stuff.

00:15:55.289 --> 00:16:06.221
But if I then put in a specific prompt to say: just stay on the facts, don't make hypotheses, and reduce the temperature of the model, it will.

00:16:06.221 --> 00:16:09.990
The same model will tell you I don't have information about this context.

00:16:09.990 --> 00:16:11.927
Please ask something else.

00:16:11.927 --> 00:16:14.908
So it's something that can be modified.
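The "stay on the facts" setup described here can be sketched against Ollama's local chat endpoint. Everything in this example is an illustrative assumption: the system prompt wording, the model name, and the refusal phrase; the point is only the mechanism of pairing a restrictive system prompt with a low temperature:

```python
import json
import urllib.request

SYSTEM_PROMPT = (
    "You are a fire safety engineering assistant. Answer only from facts "
    "you are certain of. If you lack the information, reply exactly: "
    "'I don't have information about this context.' Do not speculate."
)

def build_chat(model: str, user_msg: str, temperature: float = 0.0) -> dict:
    """JSON body for Ollama's /api/chat: a restrictive system prompt plus a
    low temperature pushes the model toward admitting ignorance instead of
    fabricating an answer."""
    return {
        "model": model,
        "stream": False,
        "options": {"temperature": temperature},
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_msg},
        ],
    }

def chat(model: str, user_msg: str) -> str:
    """Send the chat request to the locally running Ollama server."""
    body = json.dumps(build_chat(model, user_msg)).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/chat", data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]

# Example usage (needs a pulled model, e.g. `ollama pull llama3.2`):
# print(chat("llama3.2", "Who is Ruggiero Lovreglio?"))
```

With a small local model, the unconstrained default will happily invent a biography; with this setup it is far more likely to return the refusal phrase instead.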

00:16:15.539 --> 00:16:18.830
You have hallucination when the model doesn't have information about that.

00:16:18.830 --> 00:16:23.631
If you provide it with all the information that it needs, it will generate an answer.

00:16:23.631 --> 00:16:26.369
So that's the reason behind the hallucination.

00:16:26.369 --> 00:16:27.947
And they are probabilistic models.

00:16:27.947 --> 00:16:32.407
They don't even understand, they don't have consciousness, so they hallucinate.

00:16:32.407 --> 00:16:33.341
You can tell them.

00:16:33.341 --> 00:16:34.888
Hey, you're making some assumption.

00:16:34.888 --> 00:16:35.509
Put it forward.

00:16:35.940 --> 00:16:44.123
The problem with the big models that we use through the cloud is you don't have access to all these parameters, you don't have the possibility to use system prompts.

00:16:44.123 --> 00:16:51.835
You can do some tweaking on Gemini when you use it with the Google AI Studio, but most of the others, it's all locked.

00:16:51.835 --> 00:16:58.414
You don't even know what the system prompt is that OpenAI is using.

00:16:58.414 --> 00:17:00.385
So it's there, you can't touch it, and that's a big limitation.

00:17:00.385 --> 00:17:20.057
Hence, much better using those models with an API, or using open-source tools. Like Amir will say, just Google Ollama, and you will see that there are so many open solutions that are there for you for free, with models nearly close to 1 trillion parameters, and so you can even download them.

00:17:20.057 --> 00:17:28.667
But good luck finding a computer that can run them, a cluster that can run them, because they are really GPU-intense, and that's another thing that we can discuss later.

00:17:29.402 --> 00:17:31.008
So check what model your PC can run.

00:17:31.259 --> 00:17:43.009
I want to ask Amir about some of the terminology you have used, because I think for our listeners, the ones who are not that technologically savvy, let's try to clear up some concepts.

00:17:43.009 --> 00:17:49.023
You've mentioned parameters, you've mentioned temperature, you've mentioned API tools.

00:17:49.023 --> 00:17:51.865
I would like to go over in more or less this order.

00:17:51.865 --> 00:17:59.904
So perhaps, Amir: if Rino is telling me one model is like 3 billion parameters and another is a trillion parameters, what does it mean?

00:17:59.904 --> 00:18:01.546
What is a parameter in this context?

00:18:01.880 --> 00:18:10.559
It's a very good question, because it's a big start to working with AI models on the product side.

00:18:10.559 --> 00:18:20.490
When we call these GPT, or Generative Pre-trained Transformers, they're created based on some data and after that they work for some tasks.

00:18:20.490 --> 00:18:33.034
So when we talk about parameters: a large language model is created based on some data, textual data or images, depending on which model we are working with.

00:18:33.661 --> 00:18:45.355
And it's a different concept when we are talking about a dataset created based on reinforcement learning; that's different from how other models are created.

00:18:45.676 --> 00:18:51.188
So parameters relate to the size of the data that the model was created from.

00:18:51.188 --> 00:19:05.215
When we are calling this model, for example Gemma from Google, it has 4 billion parameters, so it's created based on data that has a size of 4 billion.

00:19:05.215 --> 00:19:15.160
For example, when you count the text data, PDFs or books, the size of them is 4 billion.

00:19:15.160 --> 00:19:24.790
And we have a technical term here, the context window, so a model can read a lot of data.


00:19:30.063 --> 00:19:31.849
It can read, for example, a certain amount of data, based on the context window of the model.

00:19:31.869 --> 00:19:33.054
One more follow-up question, if I can.

00:19:33.054 --> 00:19:55.007
But the simple fact that one model has more parameters does not necessarily mean that it's a better model, because if you are going for a specific use, you would probably have a better fine-tuned model with fewer parameters, very fit to what you're trying to accomplish, rather than a multi-billion-parameter model.

00:19:55.007 --> 00:20:12.133
Trained on random stuff. And I guess when they trained the next instances of Grok or ChatGPT, they probably just let it read the entire internet as a training set. As Amir was saying on this, yeah, with a bigger model it's like the brain is much bigger.

00:20:12.880 --> 00:20:15.429
It comes with a much bigger context length.

00:20:15.429 --> 00:20:26.211
The context length is basically the short-term memory of a language model, so it's where you put all the information for your current turn, all the things that you want it to digest and process.

00:20:26.211 --> 00:20:29.347
Based on that, the long-term memory is separate.

00:20:29.347 --> 00:20:34.208
What the model has been trained on can be changed, like we can do some fine-tuning on that.

00:20:34.208 --> 00:20:43.319
So that's why really big models are really good, because the reasoning capability improved and also the amount of information that can be worked on is much better.

00:20:43.883 --> 00:20:56.251
And now, could you explain that concept of decreasing temperature and those parameter settings? Because, as Rino said, you don't really play with those at all with your normal chatbots.

00:20:56.251 --> 00:21:00.817
So what do you mean by altering the parameters of the model?

00:21:00.938 --> 00:21:04.145
as a user? I think, just about your question.

00:21:04.145 --> 00:21:06.952
Everything depends on your task.

00:21:06.952 --> 00:21:24.951
Maybe in some tasks, in general tasks, larger models, for example where we have a 170-billion-parameter model, work better, because they are more complex and they can answer you with more accuracy.

00:21:24.951 --> 00:21:45.453
But when you are working on the engineering side, when you want to connect the AI to documents or some specific task, maybe a smaller model works better, because you want to create a specific brain for the AI in a specific area.

00:21:45.453 --> 00:21:47.046
So it depends on your task.

00:21:47.480 --> 00:21:58.173
If you want to ask about the weather, if you want to ask about scheduling something, I think the larger model works better than the smaller one.

00:21:58.173 --> 00:22:05.205
So I want to just mention another thing about the temperature, as I forgot to say.

00:22:05.205 --> 00:22:24.368
Temperature is so important when you are using the API. It's the creativity level of the model, so it's so important when you are working with a RAG structure or when you want to extract specific data from a document.

00:22:24.368 --> 00:22:36.272
It's important to set the temperature to zero or less than 0.5, because the temperature scale for the AI model is between zero to one.

00:22:36.272 --> 00:22:57.051
So when we decrease this and move from one to zero, we can decrease the creativity of the model, and, you know, it's a kind of way we say to the model: use our data for answering, not your brain or the data that you were trained on.

00:22:57.051 --> 00:23:02.606
So temperature in the product side is so important, choosing the model is so important.
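What the temperature dial actually does can be illustrated with a toy next-token distribution: the model's raw scores (logits) are divided by the temperature before being turned into probabilities, so a low temperature concentrates probability on the top token and a high temperature spreads it out. A self-contained sketch (the logit values are made up for illustration):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature before softmax: low temperature
    sharpens the distribution (near-deterministic), high temperature
    flattens it (more 'creative' sampling)."""
    if temperature <= 0:
        # Temperature 0 is the greedy limit: all mass on the best token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    m = max(scaled)                               # subtract max for stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]                          # scores for three candidate tokens
cold = softmax_with_temperature(logits, 0.1)      # almost all mass on token 0
hot = softmax_with_temperature(logits, 2.0)       # probability spread out
```

Sampling from `cold` gives essentially the same token every time, which is the deterministic, fact-extraction regime; sampling from `hot` wanders, which is the creative (and hallucination-prone) regime.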

00:23:02.980 --> 00:23:13.272
Temperature is like drinking a bit: you will see that if you drink a bit more, you might become more social, more chatty, and tell things that probably you don't even need to.

00:23:13.272 --> 00:23:27.094
And when you get too many of the shots, then you start telling a bit too much random stuff; and sometimes you need the model to be flexible on what it says.

00:23:27.094 --> 00:23:33.773
And if the temperature is too low then you start becoming a bit boring and not able to come out with something.

00:23:34.480 --> 00:23:41.107
I once spent an entire evening talking in German and I don't know German, so that must have been a very high temperature.

00:23:41.107 --> 00:23:46.588
It's possible to recreate this in some experimental setting at some conference if you want.

00:23:46.588 --> 00:23:54.632
I guess this is also the reason why sometimes a chatbot annoys me with the language it's using, like those ridiculous, you know, texts.

00:23:54.632 --> 00:24:02.030
You can immediately say, oh, this is chatbot-generated, no one speaks like that; and probably when you go closer to zero, it just gives you more dense, simple answers, more to the point.

00:24:02.030 --> 00:24:04.980
But again, if you want something very creative, it just gives you more dense, simple answers, more to the point.

00:24:04.980 --> 00:24:10.412
But again, if you want something very creative, it's probably good to be higher.

00:24:10.412 --> 00:24:13.607
You've again used "API", and not everyone knows what an API is.

00:24:13.607 --> 00:24:14.611
So what's API?

00:24:15.079 --> 00:24:15.844
API it's.

00:24:15.844 --> 00:24:17.270
You know if it's simple?

00:24:17.270 --> 00:24:22.589
Because you know, I'm not a computer scientist, I'm just a user of the language models and AI.

00:24:23.300 --> 00:24:24.083
I don't think any.

00:24:24.083 --> 00:24:28.432
Maybe there are a few listeners to the Fire Science Show who are computer scientists.

00:24:28.432 --> 00:24:30.728
So it's from one fire engineer to another.

00:24:33.202 --> 00:24:33.943
You know the.

00:24:34.224 --> 00:24:37.133
API is application programming interface.

00:24:37.133 --> 00:24:47.595
So you can use the API keys for using the AI in your code and calling the large language model from the server.

00:24:47.595 --> 00:24:58.373
If you are using a commercial large language model, you can call it from the server, or if you are using an open-source model, you can call it from Hugging Face using the API.

00:24:59.160 --> 00:25:09.490
So an example would be: I would be writing my code, and instead of my code solving something, I could just write a piece of code.

00:25:09.490 --> 00:25:15.708
Go ask this to ChatGPT and post me the answer, more or less? Yes, yes. Okay, good.

00:25:15.788 --> 00:25:17.491
You can even try with OpenAI.

00:25:17.813 --> 00:25:35.729
You can go to their API platform, you can generate a key, and you can see that once you have that key, you can put it in many other user interfaces that allow you to use the API of ChatGPT or any other model, and then you can run it directly in this new user interface.

00:25:35.729 --> 00:25:41.083
And the good news about that is that you don't need to have a subscription.

00:25:41.083 --> 00:25:42.125
You pay as you go.

00:25:42.125 --> 00:25:46.325
In fact, you can see how much every model costs in terms of tokens.

00:25:46.325 --> 00:25:48.428
That's the other keyword that we need to talk about.

00:25:48.428 --> 00:26:05.898
Everything runs on tokens — a set of characters. When you write a prompt, it gets converted into a number of tokens that you send to the server, and the server comes back to you with an answer that is measured and broken down the same way, and you pay the bill as you go.
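A rough back-of-envelope of the pay-as-you-go idea. The roughly-four-characters-per-token rule of thumb and the per-million-token prices here are illustrative assumptions, not any provider's actual price list.

```python
def estimate_tokens(text: str) -> int:
    # Rough rule of thumb for English text: ~4 characters per token.
    return max(1, len(text) // 4)

def estimate_cost(prompt: str, answer: str,
                  usd_per_1m_input: float, usd_per_1m_output: float) -> float:
    # Pay-as-you-go: you are billed separately for tokens sent and received.
    cost_in = estimate_tokens(prompt) / 1_000_000 * usd_per_1m_input
    cost_out = estimate_tokens(answer) / 1_000_000 * usd_per_1m_output
    return cost_in + cost_out

# Illustrative prices only -- always check the provider's current price list.
prompt = "Summarise the smoke control strategy for this atrium. " * 20
answer = "The strategy relies on mechanical exhaust and... " * 40
print(f"~{estimate_tokens(prompt)} input tokens, "
      f"${estimate_cost(prompt, answer, 0.50, 1.50):.6f} estimated")
```

For an engineer who asks a handful of questions a week, this kind of arithmetic is exactly why the API can come out cheaper than a flat subscription.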

00:26:05.898 --> 00:26:19.958
So if you have a really big company and you don't want to pay a subscription for every seat — if some of your staff don't use ChatGPT much — a single pay-per-use instance would probably be cheaper.

00:26:19.958 --> 00:26:21.711
The API would probably be cheaper than using the paid tool.

00:26:22.153 --> 00:26:28.631
Okay, I mean it's a valid question for very generic use of AI, which happens seldom.

00:26:28.631 --> 00:26:32.391
You use it, let's say once a week or twice a week.

00:26:32.904 --> 00:26:33.907
Is it even worth it to

00:26:33.949 --> 00:26:35.051
go for? It's much cheaper.

00:26:35.313 --> 00:26:42.192
Okay, good. Guys, let's bring this discussion closer to fire safety engineering, because that's what I really wanted to know.

00:26:42.192 --> 00:26:46.942
I mean, it's fascinating to observe the AI revolution happening in front of our eyes.

00:26:46.942 --> 00:26:48.166
It's absolutely crazy.

00:26:48.166 --> 00:26:52.500
But well, let's get it closer to fire safety engineering.

00:26:52.500 --> 00:27:10.347
Before we started talking, I went to one of my favorite websites, willrobotstakemyjob.com, and this website actually has fire prevention and protection engineers in it, and it gives me minimal risk.

00:27:10.347 --> 00:27:19.106
So it tells me that there's a 19% risk that AI will take over fire safety engineers' jobs.

00:27:19.106 --> 00:27:27.789
It also tells me the average wage of a fire safety engineer is $103,000 a year, which is very reassuring to me.

00:27:27.789 --> 00:27:30.256
What's so hard about fire safety engineering?

00:27:30.256 --> 00:27:34.310
That we are at minimal risk of being taken over by AI?

00:27:34.811 --> 00:27:40.872
No, this is a partial answer, because we can accelerate a lot of the work of fire protection engineers.

00:27:40.872 --> 00:27:46.345
That means a firm will have the capability to run more projects.

00:27:46.345 --> 00:27:59.315
That means that competition is going to be higher, and a company will probably need fewer engineers — but engineers capable of augmenting the staff using AI.

00:27:59.315 --> 00:28:11.994
It's helping to speed up a lot of the work, to make more informed decisions, to have much more context when you make decisions — but you still need a human brain to make the call.

00:28:12.486 --> 00:28:26.612
While I agree, I would rather say that, with an ever-increasing workload, it's just going to allow us to catch up rather than decrease the number of fire engineers needed, which is also a very positive observation. But still, you know, 19%.

00:28:26.612 --> 00:28:29.787
There are jobs, like data scientists, which have 95%.

00:28:29.787 --> 00:28:34.421
There are jobs that are like on an imminent risk of extinction.

00:28:34.421 --> 00:28:41.935
You know, and we're not. Why is fire safety engineering not in imminent danger of extinction by AI?

00:28:41.935 --> 00:28:43.979
What's special about

00:28:44.098 --> 00:28:49.771
us? Really complex tasks to do, and not much data on which the model could be trained.

00:28:49.771 --> 00:28:51.852
For programming, it can go on the web.

00:28:51.852 --> 00:28:59.994
There is so much code you can train a model on, and most of the work done in the engineering field stays in the engineering field.

00:28:59.994 --> 00:29:01.286
You don't write a post saying:

00:29:01.286 --> 00:29:07.376
we solved the big project challenge. It's good business, and this is the property of the company —

00:29:07.376 --> 00:29:13.050
to maintain all this knowledge, because that's what you sell for the next project.

00:29:13.050 --> 00:29:17.205
What we have been trying to do is to increase that within the company.

00:29:17.205 --> 00:29:22.276
If you give your model data injection, or RAG — that stands for Retrieval-

00:29:22.276 --> 00:29:25.548
Augmented Generation — it will become more and more expert

00:29:25.548 --> 00:29:31.891
on the safety codes, trying to write FDS code — and we are still at the baby stage.

00:29:31.891 --> 00:29:35.278
I will say, for some tasks we're at a really advanced stage, for others not.

00:29:35.278 --> 00:29:38.795
We just filed that paper with Amir, and probably he can tell us about that.

00:29:39.426 --> 00:29:57.631
Yeah, we'll go deeper into that, but I would like to pull on that. Because, if I assume that it has been trained on the entirety of the internet — the internet has a lot of resources on fire safety engineering — and I assume it has been trained on all of the books of the world, which includes every book that we would be using.

00:29:57.631 --> 00:30:01.766
Therefore, I would argue that quantity alone is an insufficient explanation.

00:30:01.766 --> 00:30:03.428
There must be something else.

00:30:03.428 --> 00:30:17.914
I believe that the quantity of raw data is sufficient, but perhaps there were insufficient examples of turning this data into solutions — because I think that's also something it needs to train on.

00:30:17.914 --> 00:30:27.795
How did you solve a problem — an engineering problem, a programming problem — with the knowledge you had? And then it can follow the patterns, the breadcrumbs that were used.

00:30:27.795 --> 00:30:28.596
I think that's what it needs.

00:30:28.724 --> 00:30:36.229
Yeah, we don't have a lot of big project solutions publicly available for everyone, and so we can't use them to train.

00:30:36.229 --> 00:30:41.528
Instead, in many other fields, the results are already out there, whatever gets generated.

00:30:41.528 --> 00:30:47.951
I don't believe that this is a big limitation, because most of these models are mimicking us.

00:30:47.951 --> 00:30:56.026
They are not yet capable of saying: okay, let me be smart now and try to use it in the real world.

00:30:56.026 --> 00:30:58.673
I will say that the models that we have, however improved,

00:30:58.673 --> 00:31:05.357
are still at the stage where they need to have the real experience and hit against the wall.

00:31:05.744 --> 00:31:12.471
I can give you a fire safety engineering example of how knowledge doesn't mean answer, and it's from my PhD.

00:31:12.471 --> 00:31:18.417
So let's assume you're solving an optimized case for smoke flow in a shopping mall.

00:31:18.417 --> 00:31:23.798
First you have to solve the room of the fire, which means you have to use some sort of plume model.

00:31:23.798 --> 00:31:25.451
Let's say you're using Thomas plume.

00:31:25.451 --> 00:31:28.755
Then you need to get that smoke outside through doors.

00:31:28.755 --> 00:31:32.134
So you need a model for flow outside of the doors, and those are scarce.

00:31:32.605 --> 00:31:35.352
There are some approximations. Then the flow under the balcony —

00:31:35.352 --> 00:31:37.172
That's completely unsolved.

00:31:37.172 --> 00:31:40.895
There's a very rough, rough approximation by Margaret Law.

00:31:40.895 --> 00:31:45.195
There is some stuff in NFPAs but like ridiculously rough.

00:31:45.195 --> 00:31:47.590
There's my PhD, which is in Polish.

00:31:47.590 --> 00:32:06.653
Then you have a plume along the wall, which is Harrison and Spearpoint, and suddenly you have five, six models that you have to connect in a very nice symphony, one fitting another — and there is no single piece of literature that will tell you how exactly to do that for any generic case.

00:32:06.653 --> 00:32:11.468
So I mean, if AI was able to figure that out, I would be very surprised.

00:32:11.468 --> 00:32:16.733
I mean, if you ask it about the Thomas or Harrison and Spearpoint work, it's going to tell you about it.

00:32:16.733 --> 00:32:20.462
But to be able to apply that in practice, that's a hell of a challenge.

00:32:20.502 --> 00:32:21.769
I would say, we are not there yet.

00:32:23.538 --> 00:32:28.122
That's why we are saying that we don't have general intelligence, because that's something that general intelligence will be capable of.

00:32:28.122 --> 00:32:32.510
Not many humans will probably have the critical thinking to pull this together.

00:32:32.510 --> 00:32:37.134
It requires quite a lot of training also for us human beings to reach that level.

00:32:37.134 --> 00:32:38.766
In fact, you did it for your PhD.

00:32:38.766 --> 00:32:40.569
A PhD is not like you go to an undergrad.

00:32:40.569 --> 00:32:41.853
I spent three years on that.

00:32:41.853 --> 00:32:48.474
You don't go to an undergrad and say: hey, tell me — it's certain they don't know how to answer this.

00:32:49.226 --> 00:33:02.353
By definition, if you're doing a PhD on something, you are, at that point, probably the most capable human being on the planet doing this exact thing — unless you're doing a PhD on something that a hundred other people are pursuing, which is very unlikely in fire science, right?

00:33:02.353 --> 00:33:04.132
So, indeed, yeah.

00:33:04.152 --> 00:33:04.271
Amir.

00:33:04.271 --> 00:33:07.491
We can do many steps and probably Amir can tell us more.

00:33:07.644 --> 00:33:21.015
You know, so — I think I've been programming for more than 15 years, and I've worked with AI since 2022, and worked with a lot of machine learning and deep learning models.

00:33:21.015 --> 00:33:23.032
I said this as a background.

00:33:23.032 --> 00:33:34.954
I wanted to say, I believe we can't use AI for generating the solution, because AI can't think now.

00:33:34.954 --> 00:33:39.757
For that we are waiting for AGI — artificial general intelligence.

00:33:39.757 --> 00:33:46.377
And the main problem that we have with the AI models is causality.

00:33:46.377 --> 00:34:03.013
They can't understand the causal relationships between the variables that we have — for example, for the fire simulation, for the fire problem, or, you know, in my area, the transportation problem or the evacuation problem.

00:34:03.013 --> 00:34:14.204
So for that reason, if we wanted to create the solution using the AI for our problems, we need to inject the causal relationships to the models.

00:34:14.204 --> 00:34:20.597
We have some thinking models, like the OpenAI o3 or DeepSeek R1.

00:34:20.597 --> 00:34:30.614
They are working with the reinforcement learning and they tell you yeah, this is the thinking process, but I think this is not the thinking process.

00:34:30.614 --> 00:34:41.512
This is just reworking the prompt — you know, expanding the prompt to give you responses, solutions. So we can use AI.

00:34:41.965 --> 00:35:07.605
I believe that we can use AI to create some pipelines, make automations, create some simulation inputs like we did before, or, you know, create some consistency when you are using the manuals or the guidelines. Because, as you know, for example, we have NFPA 101 as a life safety code.

00:35:07.605 --> 00:35:19.976
It has more than 500 pages and it has a lot of complicated texts and graphs and it's even hard to extract accurate data based on our problem.

00:35:19.976 --> 00:35:21.746
So AI can help us.

00:35:23.306 --> 00:35:33.994
AI can parse these documents, can interpret plots and tables, and, using the RAG system that was mentioned, we can extract accurate data from them.

00:35:33.994 --> 00:35:41.579
So AI can help us in this field, but I believe AI can't help us to generate the solution.

00:35:42.460 --> 00:35:47.103
I'll ask you a question and I completely do not expect that you know an answer.

00:35:47.103 --> 00:35:51.226
I'm very happy if you tell me what you feel.

00:35:51.226 --> 00:36:01.728
But when they were training stuff like ChatGPT or Grok — there's this library of NFPA, as you've mentioned —

00:36:01.728 --> 00:36:10.958
you think it all went into the machine of learny-learny, or did they somehow stop the capitalist instincts and say: no, we're not touching it, it's protected content?

00:36:10.958 --> 00:36:33.693
Because I have a feeling everything went in, and I wonder: if I ask a question about a specific aspect of a code, to what extent is it giving me an answer directly from the code, and to what extent is it giving me an answer based on some comments on Facebook from five years ago from a random guy?

00:36:33.893 --> 00:36:48.445
It's important, because sometimes it can be, you know, the answer of something that was trained on someone talking about the code rather than the code itself — so the information from the code has been filtered by someone else, possibly, although not necessarily in a wrong way.

00:36:48.726 --> 00:37:00.432
Unless it has been filtered in the wrong way — hopefully not. And a good way to make an attempt to see is to ask: okay, can you quote the code that you're using, once or twice?

00:37:00.432 --> 00:37:21.353
And I believe that whoever is developing those models has also put in some protection, because there are also lawsuits against OpenAI — people are claiming: ah, they used my book without even asking for permission. And I believe that now you won't even get an honest answer from the model if you say: can you please give me a part out of this book?

00:37:23.067 --> 00:37:27.231
I think the filtration happened at the response layer, not at the teaching layer.

00:37:27.231 --> 00:37:35.836
I think they just fed it anything they could, and now they're just filtering it — to not create a lawsuit — into compliant responses to humans.

00:37:36.284 --> 00:37:41.969
That's why data injection and RAG are the way forward for specific fields like ours.

00:37:41.969 --> 00:37:44.432
What was data injection? Data injection —

00:37:44.432 --> 00:37:49.780
we talked about the general concept that we have a long-term and a short-term memory.

00:37:49.780 --> 00:37:56.257
Yes, data injection is when you just give all the context you want the answer on in one prompt.

00:37:56.478 --> 00:37:56.617
Okay.

00:37:56.925 --> 00:38:02.936
Basically, you ask: tell me about this or that, but this is all the knowledge that you need to use to generate the answer.

00:38:02.936 --> 00:38:09.007
So it's really important to have a really big short-term memory, so you can inject all the information.

00:38:09.007 --> 00:38:12.219
So the model will try to generate answers based on this.

00:38:12.219 --> 00:38:13.625
Can you do it easily?

00:38:13.625 --> 00:38:15.590
Yeah, you can do it with ChatGPT.

00:38:15.590 --> 00:38:25.976
If you create your own GPT, it's basically using data injection, and whenever you ask something, you basically tell the model: use all this extra knowledge to give me an answer.
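A minimal sketch of what data injection looks like in practice — the clause text and the wrapper wording below are invented for illustration; the point is simply that the reference material travels inside the prompt itself.

```python
def inject_context(question: str, documents: list[str]) -> str:
    # "Data injection": paste the whole reference text into the prompt
    # (the model's short-term memory), so it answers from that material
    # rather than from whatever it absorbed during training.
    context = "\n\n".join(documents)
    return (
        "Use ONLY the following reference material to answer.\n"
        "If the answer is not in it, say so.\n\n"
        f"--- REFERENCE ---\n{context}\n--- END REFERENCE ---\n\n"
        f"Question: {question}"
    )

# A made-up clause, purely for illustration.
prompt = inject_context(
    "What is the minimum clear width of the exit door?",
    ["Clause 7.2.1 (hypothetical): exit doors shall provide a clear "
     "width of not less than 810 mm."],
)
print(prompt)
```

This is why a big context window matters: the more reference text you can fit in one prompt, the more the model's answer is anchored to your material.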

00:38:25.976 --> 00:38:40.255
In this specific context, RAG is not that accessible if you are at the ABC of generative AI, and I guess Amir will do a much better job of explaining RAG, because yesterday they told me about RAG.

00:38:40.255 --> 00:38:41.563
Amir?

00:38:42.007 --> 00:38:44.565
You know, you mentioned before about the hallucination.

00:38:44.565 --> 00:38:50.378
It was the main problem that we faced during 2023.

00:38:50.378 --> 00:38:55.556
So they produced some wrong references and wrong answers.

00:38:55.556 --> 00:39:06.478
So after that, they developed some methods to make the AI find answers based on the real documents.

00:39:06.478 --> 00:39:10.956
So they developed RAG, which is Retrieval-Augmented Generation.

00:39:10.956 --> 00:39:16.757
So we tell the AI: use this real document to find answers.

00:39:16.856 --> 00:39:19.347
For me, we have two solutions.

00:39:19.347 --> 00:39:25.639
You know, to create an AI specialized in the area — for example, in fire engineering —

00:39:25.639 --> 00:39:31.494
we can fine-tune the large language model, or we can use the RAG method.

00:39:31.494 --> 00:39:46.778
Based on my experience, the RAG method works better when we are facing manuals and guidelines, because we inject, you know, the data and the real documents into the AI.

00:39:46.778 --> 00:39:55.344
But in the process of fine-tuning, you should provide some Q&A to the model and after that fine-tune the model based on that data.

00:39:55.344 --> 00:40:08.545
For this I should say: for the first time, we produced a big Q&A dataset for the evacuation and fire safety field — more than 24,000 questions and answers, with the resources.

00:40:08.545 --> 00:40:12.476
Everyone can access it on Hugging Face.

00:40:13.045 --> 00:40:31.853
When we evaluated the fine-tuned model and the RAG system, we found that RAG works better, because with RAG you can test different methods — for example, inject the documents directly to the AI — and you can set the temperature for the model.

00:40:31.853 --> 00:40:36.016
So you can say, just use this document without any creativity.

00:40:36.016 --> 00:40:44.139
You can just use your abilities in combining the chunks to generate the responses.

00:40:44.139 --> 00:40:52.054
So for extracting accurate data, I think, based on my experience, RAG works better than the fine-tuned model.

00:40:52.054 --> 00:41:06.893
We fine-tuned the large language model Gemma 3 for fire engineering and fire science, so that everyone can access this on Hugging Face and, you know, load it and use it.

00:41:07.125 --> 00:41:12.472
This model is a lightweight model so you can use this.

00:41:12.472 --> 00:41:24.476
But in the process of RAG and fine-tuning a model for extracting data from documents and treating hallucination, the main thing is privacy and cost.

00:41:24.476 --> 00:41:27.570
So it's a trade-off between the privacy and cost.

00:41:27.570 --> 00:41:44.858
If you don't have any concern regarding copyright or something like that, you can use a commercial model, like Claude or ChatGPT, and you can send the documents to the server and, after that, using the RAG structure, get responses.

00:41:44.858 --> 00:41:59.965
But, based on the data and the documents that you have, if you have a concern regarding copyright or privacy, you can use the local models without, you know, connecting to the cloud.

00:42:00.164 --> 00:42:03.896
Yeah, that's exactly where I wanted to get eventually in this episode.

00:42:03.896 --> 00:42:09.478
I mean, we're at the point where ChatGPT could help you a lot with your engineering work.

00:42:09.478 --> 00:42:16.679
Imagine you're writing a fire strategy and you would like to do some general summary of the building you're working on.

00:42:16.679 --> 00:42:26.349
You could technically upload the whole documentation of that model into ChatGPT, which opens the possibility that in a few years you will end up in jail.

00:42:26.349 --> 00:42:27.311
You cannot exclude that.

00:42:27.311 --> 00:42:28.976
Unfortunately, it's not something.

00:42:28.976 --> 00:42:32.112
I would recommend that you send your confidential data to ChatGPT.

00:42:32.112 --> 00:42:40.929
The same if you're writing a research grant, you should really not upload your whole research grant into ChatGPT to give you a summary, because you're lazy to write it.

00:42:41.391 --> 00:42:56.673
How do we do it? And I understand, observing Professor Rino over the internet, that there is this magic way where you just set up your own instance of an AI assistant of some sort that works locally on your machine.

00:42:56.673 --> 00:42:59.186
Rino mentioned you don't even need the internet.

00:42:59.186 --> 00:42:59.947
It's going to work.

00:42:59.947 --> 00:43:02.873
So tell me like it's.

00:43:02.873 --> 00:43:05.137
Basically, you become Google or Meta.

00:43:05.137 --> 00:43:10.137
You said you build up your own language model and use it for your needs.

00:43:10.137 --> 00:43:11.449
How does that work?

00:43:11.784 --> 00:43:17.931
The good news: it's for free, because all the stuff that you use is open, so you just need to download it.

00:43:17.931 --> 00:43:26.032
The true answer is not for free, because you need to have a really good graphic card, especially if you want to have a usable model.

00:43:26.032 --> 00:43:35.331
Just to give you some context, personally, I believe that the local models that are below 10 billion parameters they are usable, but not that great.

00:43:35.331 --> 00:43:39.550
To generate the answer, they can be not great, it depends on the task.

00:43:39.550 --> 00:43:48.096
So that means that you need to have a graphic card with more than 20 gigabytes of dedicated RAM, and so it's getting expensive.

00:43:48.096 --> 00:43:51.393
That's why, if you look, the price of graphic cards is going up.

00:43:51.393 --> 00:43:54.449
We had a spike when there was Bitcoin booming.

00:43:54.449 --> 00:44:01.516
Then the system collapsed because people stopped mining coins, and now we are using the same graphic cards.

00:44:01.516 --> 00:44:08.244
We were playing all our favorite video games in VR, and we can recycle the same graphic cards for AI.

00:44:08.244 --> 00:44:17.364
So what you can do is just install on your computer one of the different user interfaces, if you want to have an experience similar to ChatGPT.

00:44:17.585 --> 00:44:19.893
I put a lot of my tool links down below — LM Studio, Page Assist.

00:44:19.893 --> 00:44:20.635
My favorite is Open WebUI.

00:44:25.672 --> 00:44:31.097
It's really flexible, but it comes with a lot of headaches to install it.

00:44:31.097 --> 00:44:36.297
And then you install on your local PC one of those open models.

00:44:36.684 --> 00:44:46.364
My advice is to go to the Ollama list and pick something that will run on your graphic card — and you can even make an attempt at a really big model.

00:44:46.364 --> 00:45:00.942
For instance, on my PC I can easily install a 17 billion parameter model on my not-dedicated RAM — the traditional RAM that is used with the CPU — and you see that the 17 billion parameter model starts to really lag.

00:45:00.942 --> 00:45:08.771
So it became like: okay, no, this is not usable — unless I just want to have a good answer,

00:45:08.771 --> 00:45:15.152
I give a question and then I go to sleep and the morning after I read the answer, that's what it became.

00:45:15.152 --> 00:45:25.708
So if you want something usable — meaning 10, 20 tokens per second, so it looks like a ChatGPT answer — then you need to have something that runs on your graphic card.

00:45:26.847 --> 00:45:29.065
But do you have to train it from scratch?

00:45:29.065 --> 00:45:35.340
Does it come pre-trained with some abilities? Like, what's the baby stage of that software?

00:45:35.340 --> 00:45:36.284
You install it.

00:45:36.284 --> 00:45:38.664
What can it do once you install it?

00:45:38.664 --> 00:45:41.307
I guess nothing or you can.

00:45:41.608 --> 00:45:44.016
once you have the model, you can start using it just for.

00:45:44.164 --> 00:45:44.545
What does it mean?

00:45:45.931 --> 00:45:57.652
Once you have a model: you install the software, and then — either through the software, or there are other ways — you need to install the model that you want to use in this software.

00:45:57.652 --> 00:46:01.916
So the software is just a user interface for you that allows you to.

00:46:01.916 --> 00:46:03.610
But where do I get the model?

00:46:03.610 --> 00:46:05.309
The model is free, from Hugging Face or the Ollama list.

00:46:09.141 --> 00:46:11.150
It will allow you to download whatever model.

00:46:12.353 --> 00:46:16.594
And that model is already, like, responsive — pre-trained, able to work?

00:46:16.673 --> 00:46:19.900
It's like ChatGPT style.

00:46:19.900 --> 00:46:30.456
It's already answering all the questions, and then there are different customizations that you can add: you can have RAG, you can change the temperature, the system prompts.

00:46:30.456 --> 00:46:35.690
You have free control over how the model is running.

00:46:35.690 --> 00:46:37.394
Do you choose that when downloading the model?

00:46:37.394 --> 00:46:40.635
It's related to the graphic card that you have.

00:46:40.635 --> 00:46:56.211
If you have a graphic card that has four gigabytes of dedicated RAM, I would suggest not going above four billion parameters; otherwise it's going to run on the other RAM — the one that the CPU uses — and it's going to be incredibly slow.

00:46:56.211 --> 00:47:06.813
So of course you can think: I'll run the biggest model ever. Yeah, but most likely your computer won't be able to run it, unless you have a really dedicated graphic card or a cluster.
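As a rough sanity check of the "four gigabytes, four billion parameters" rule of thumb above — the bytes-per-parameter figure below assumes an aggressively quantized model and is a ballpark assumption, not a spec:

```python
def fits_in_vram(params_billion: float, vram_gb: float,
                 bytes_per_param: float = 0.75) -> bool:
    # Ballpark: a heavily quantized model needs very roughly 0.5-0.75
    # bytes per parameter, plus headroom for the context cache.
    needed_gb = params_billion * bytes_per_param
    return needed_gb <= vram_gb * 0.9  # keep ~10% headroom

for size in (1, 4, 8, 12, 70):
    verdict = "ok" if fits_in_vram(size, 4) else "too big (spills to CPU RAM)"
    print(f"{size:>3}B model on a 4 GB card: {verdict}")
```

On this rough arithmetic a 4 GB card tops out around 4 billion parameters, which matches the advice in the conversation; anything that spills over runs on CPU RAM and slows to a crawl.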

00:47:07.505 --> 00:47:13.530
And when I want to teach it some fire safety stuff — let's say I have 50 papers of my own,

00:47:13.530 --> 00:47:17.097
I want it to be extremely familiar with my papers — what do I do?

00:47:17.405 --> 00:47:18.751
You have two options right.

00:47:18.751 --> 00:47:21.650
That means that you process the data.

00:47:21.650 --> 00:47:24.016
Could be semantic database.

00:47:24.016 --> 00:47:37.052
A semantic database means that you ask about A, and the model tries to extract information from the existing knowledge close to the concept A — nearby — and you can say how close.

00:47:37.052 --> 00:47:47.467
Instead, if you do the fine-tuning, as Amir was saying, that means that you use all this information to create question and answer based on these contents.

00:47:47.467 --> 00:47:58.471
So there is a lot more processing of the data, and then you can have a model that is already trained on this specific information, so it will run more freely. Instead, with the RAG prompt,

00:47:58.471 --> 00:48:02.250
you will see a lot of delay, because you send the prompt there,

00:48:02.250 --> 00:48:04.996
the prompt gets converted into a semantic meaning,

00:48:04.996 --> 00:48:15.750
then you go into the vector database, you get all the information related to this semantic query, bring it back to the general model, and then it gives you an answer.

00:48:15.750 --> 00:48:17.512
So it might take 10, 15 seconds.
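The retrieval pipeline just described can be sketched in a few lines. A real system uses a learned embedding model and a proper vector database; the bag-of-words similarity below is only a stand-in, and the code chunks are invented examples.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: a bag-of-words vector.
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    # Rank the stored chunks by closeness to the query, keep the best ones.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:top_k]

def rag_prompt(query: str, chunks: list[str]) -> str:
    # The retrieved chunks are injected back into the prompt for the model.
    context = "\n".join(retrieve(query, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

chunks = [  # invented stand-ins for chunks of a guidance document
    "Travel distance to an exit shall not exceed the limits in the table.",
    "Sprinkler systems shall be inspected annually.",
    "Exit signs shall be illuminated at all times.",
]
print(rag_prompt("What is the maximum travel distance to an exit?", chunks))
```

The delay the guests mention comes from exactly these extra steps: embedding the query, searching the database, and stuffing the retrieved text back into the prompt before the model even starts answering.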

00:48:17.905 --> 00:48:19.130
I'm patient enough.

00:48:19.130 --> 00:48:22.134
I downloaded movies from Torrent and it took a week.

00:48:22.134 --> 00:48:23.730
I'm patient enough for this.

00:48:23.730 --> 00:48:28.731
My children are definitely not patient enough for this, but we'll see where it gets in some years.

00:48:28.751 --> 00:48:31.333
Yeah, we came from the internet, good.

00:48:32.114 --> 00:48:41.684
So what level of utility can you get by fine-tuning that?

00:48:41.684 --> 00:48:43.349
Like, can you expect this to be truly supportive of fire safety

00:48:43.349 --> 00:48:45.155
engineers' daily work, in that case?

00:48:45.706 --> 00:48:53.856
Just before that, I should share some good news: Ollama yesterday published a software for Mac and Windows.

00:48:54.545 --> 00:49:05.168
So the easy way to use the local models is to go to Ollama, download the Ollama software, and choose the model that you want from the Ollama list.

00:49:05.168 --> 00:49:26.177
For the first step, I propose using Phi-3 from Microsoft — 3.8 billion, so you can run it on any computer — or Gemma 3, 1 billion or 4 billion. So that's the first way you can work with the local models.
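Once installed, Ollama serves a local HTTP API (on port 11434 by default), so a first experiment can look like the sketch below. The model tag assumes you have already pulled a Gemma 3 4B model; temperature 0 asks for deterministic, "no creativity" answers, and nothing in the request ever leaves your machine.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_ollama_request(prompt: str, model: str = "gemma3:4b",
                         temperature: float = 0.0) -> dict:
    # temperature 0 = deterministic answers, no "creativity".
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"temperature": temperature},
    }

def ask_local(prompt: str, model: str = "gemma3:4b") -> str:
    # The request only ever goes to localhost -- your data stays put.
    body = json.dumps(build_ollama_request(prompt, model)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

if __name__ == "__main__":
    try:
        print(ask_local("What does ASET stand for in fire safety?"))
    except OSError:
        print("Ollama is not running locally - start it and pull a model first.")
```

The same local endpoint is what the graphical front ends (Open WebUI, Page Assist and friends) talk to behind the scenes.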

00:49:26.177 --> 00:49:29.460
From the fine-tuning side.

00:49:29.460 --> 00:49:33.581
It's a little hard to describe the whole process.

00:49:33.581 --> 00:49:43.007
You know, as an engineer, not as a computer scientist, but you know.

00:49:43.007 --> 00:49:58.394
Imagine you have a brain that's trained on general data, but you want to inject some specific data to generate, for example, responses regarding fire science without using external documents.

00:49:58.394 --> 00:50:21.199
So when you're fine-tuning a 4 billion parameter model for fire engineering — so you don't need to upload any documents to it — you should create a question-and-answer database: real questions and the expected answers that you need to get from the large language model.

00:50:21.824 --> 00:50:28.635
So, you should create this database and after that inject this database and train the model based on that.

00:50:28.635 --> 00:50:37.380
So, as I've said before, we created a 24,000 question-and-answer database and after that fine-tuned the model.

00:50:37.380 --> 00:50:57.112
So it's a little hard, because you, as the expert who wants to fine-tune a model, should read the documents, extract the relevant questions, create a good answer for each question, and build the dedicated database for it.
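The question-and-answer database described here is typically stored as JSON lines, one pair per line. The pairs below are invented fire-safety examples, and the exact field names vary between fine-tuning toolkits.

```python
import json

# Illustrative Q&A pairs in the style many fine-tuning toolkits accept
# (field names vary by tool -- these are assumptions, not a standard).
qa_pairs = [
    {
        "question": "What does RSET stand for in evacuation analysis?",
        "answer": "Required Safe Egress Time: the time occupants need to evacuate.",
    },
    {
        "question": "What does ASET stand for?",
        "answer": "Available Safe Egress Time: the time before conditions "
                  "become untenable.",
    },
]

def to_jsonl(pairs: list[dict]) -> str:
    # One JSON object per line -- the usual on-disk format for training data.
    return "\n".join(json.dumps(p) for p in pairs)

print(to_jsonl(qa_pairs))
```

Multiply this by tens of thousands of pairs and you have the kind of dataset the guests published on Hugging Face.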

00:50:57.112 --> 00:51:01.675
There are some methods to create this database, which I used.

00:51:01.675 --> 00:51:04.373
You can use active learning.

00:51:04.373 --> 00:51:08.565
You can use the AI to generate question and answer.

00:51:08.565 --> 00:51:26.974
There are some tools for that. But, based on the results of my dissertation defense that I did two days ago, the RAG system works better than the fine-tuned model in the evacuation problem — though that was based on the GPU that I had.

00:51:26.974 --> 00:51:29.835
I used the Gemma 3 4 billion parameter model.

00:51:29.835 --> 00:51:38.574
Maybe if I fine-tune a bigger model — for example, 12 billion parameters — the fine-tuned model would work better than RAG.

00:51:39.025 --> 00:51:45.914
I was saying that we need to acknowledge also the work done with Mike Spearpoint and Pete Lawrence from Greenwich.

00:51:45.914 --> 00:51:55.954
We have done a lot of work also asking fire engineers: okay, that's the answer that you get out of the mouth of our RAG — what do you think about the quality of the answer?

00:51:55.954 --> 00:52:00.916
And in the paper that we published, they were saying: no, not that good.

00:52:00.916 --> 00:52:02.891
Oh, actually, this solution is good.

00:52:02.891 --> 00:52:08.998
And we were also forcing the model to provide a comment on what was the reason it was getting these results.

00:52:08.998 --> 00:52:13.996
So it's good to see that for some tasks they were like that's not bad.

00:52:15.126 --> 00:52:18.737
I mean, if you're talking about Spearpoint's response, that's a very high evaluation.

00:52:18.737 --> 00:52:25.338
If he tells you it's not bad, that's probably somewhere in the top five percentile.

00:52:25.625 --> 00:52:28.335
I trained Amir on Mike as well, before working with him.

00:52:28.786 --> 00:52:37.697
I am not sure if the world is ready for Spearpoint GPT, but it would definitely be very, very interesting.

00:52:37.697 --> 00:52:44.318
Okay, guys, I'll try to put some papers in the show notes.

00:52:44.318 --> 00:52:52.014
I see Amir's paper on enhancing occupant evacuation simulations, the one that you were referring to.

00:52:52.014 --> 00:52:53.572
The last year paper was which one?

00:52:54.264 --> 00:52:57.896
No, it was the paper that we worked with that you said.

00:52:57.896 --> 00:53:01.054
This year we presented it at ASCE.

00:53:02.387 --> 00:53:07.514
Enhancing occupant evacuation simulation using LLMs and retrieval-augmented generation.

00:53:08.065 --> 00:53:10.666
Fantastic, good, good, good. Guys,

00:53:10.666 --> 00:53:24.688
I mean it's great to catch up on the modern technology, and I think we've given the fire safety community a good wrap-up of what's been happening in the last years and what the technology looks like. And I think it's…

00:53:24.688 --> 00:53:42.985
I mean, we've not explicitly spoken about the future and how exciting it is, but it's kind of obvious where we're heading, and this general AI, AGI, is going to be very, very interesting if it eventually happens.

00:53:42.985 --> 00:53:54.931
Even getting close will create some very interesting times, and let's hope that the ripples that scatter through the world are not too damaging to fire safety engineering and actually enhance our capability to deliver great engineering.

00:53:54.931 --> 00:54:08.474
I would love for AI to take the very boring parts of my job and allow me to do the creative and interesting parts. Though artists thought the same, and then Midjourney came, so I'm not so sure.

00:54:08.474 --> 00:54:17.820
Any final words of encouragement or perhaps warnings that you want to say at the end, rino?

00:54:28.344 --> 00:54:35.652
and I believe that we need to start integrating more and more these AI courses within fire safety engineering courses.

00:54:35.652 --> 00:54:40.536
It's like not having sex education in school and figuring it out yourself.

00:54:40.536 --> 00:54:51.099
It's the same here: there are a lot of risks if you don't know all the possible outcomes, bad and good, and you just go wild and explore by yourself.

00:54:51.099 --> 00:55:01.778
Like we were talking about copyright: you don't want to end up uploading the SFPE Handbook to a language model when you don't have the rights to do it.

00:55:01.778 --> 00:55:05.110
You have the right to have a copy of it and use that copy.

00:55:05.110 --> 00:55:13.034
That doesn't mean that you now have the right to resell the SFPE Handbook or give it to another company to make use of it.

00:55:13.034 --> 00:55:19.356
So we need to start discussing all the good things and the bad things, and be open about it.

00:55:19.981 --> 00:55:21.304
Yep, good. Amir,

00:55:21.304 --> 00:55:26.050
Anything from your AI journey that you'd like to share with people?

00:55:26.371 --> 00:55:50.347
I want to just say a general thing to the engineers: don't be anti-AI engineers, because AI is a tool that can make our work easier, so we can use it, but we should maintain our creativity as humans, as researchers, and as engineers.

00:55:50.347 --> 00:56:01.413
For example, I don't like to generate text or reports using AI, because I want them to come from my brain and my ideas, not from the AI.

00:56:01.413 --> 00:56:06.565
But AI is just a tool, like a programming language.

00:56:06.565 --> 00:56:13.268
We can use it to make our work and, you know, our creative work easier.

00:56:13.268 --> 00:56:18.032
So don't be anti-AI engineers.

00:56:18.211 --> 00:56:28.637
Use these tools and enjoy them. Yeah, and I can second that, and I resonate with what you said, Rino, about ethical use, efficient use, and just useful use of AI.

00:56:28.637 --> 00:56:30.778
I wonder are you working on that?

00:56:30.778 --> 00:56:35.081
At Massey? You're an academic, and you say AI should be integrated in teaching.

00:56:35.081 --> 00:56:36.601
Are you integrating AI in teaching?

00:56:37.061 --> 00:56:47.793
Pushing, pushing, pushing, pushing, and sometimes universities are too slow to reshuffle themselves to integrate new knowledge.

00:56:47.793 --> 00:56:49.943
I believe that is going to be a good selling point for many degrees.

00:56:49.943 --> 00:57:07.137
If we start properly teaching people the basics about what AI is, what they can use, and how to optimize what used to be done with Excel or in a simple manual way.

00:57:07.137 --> 00:57:09.164
So that will give people a much higher edge.

00:57:09.164 --> 00:57:14.398
And I'm afraid some universities are too afraid, too scared, to understand what the technology is.

00:57:14.398 --> 00:57:23.965
And I understand where they're coming from, because they don't want to be sued, they don't want this and that, but I am one of the people here at Massey creating problems.

00:57:24.106 --> 00:57:28.331
Saying hey, let's go, let's do it, I can do it tomorrow, guys.

00:57:29.106 --> 00:57:38.173
And Amir, to what you said: I think opposing AI today is like opposing sewing machines or steam engines in the 19th century.

00:57:38.173 --> 00:57:49.512
And another parallel that comes to my mind is my maths teacher telling me to learn how to multiply in my head, because I'm not going to carry a calculator with me my entire life.

00:57:49.512 --> 00:57:51.137
Right, haha, joke's on you.

00:57:51.137 --> 00:57:57.833
So, like the future is interesting, we're living in very interesting times.

00:57:57.833 --> 00:58:04.074
I've heard that it's an old Chinese curse, so I hope the times do not become too interesting.

00:58:04.074 --> 00:58:05.952
Thank you guys for coming to the Fire Science Show.

00:58:05.952 --> 00:58:09.030
We at least made this hour of our listeners' day more interesting.

00:58:13.478 --> 00:58:14.380
Thanks, guys, thank you so much, thank you.

00:58:14.400 --> 00:58:15.541
Thank you so much, and that's it.

00:58:15.541 --> 00:58:16.302
Thank you for listening.

00:58:16.302 --> 00:58:24.737
Since we recorded this episode a week ago, there have been at least three new releases of large language models.

00:58:24.737 --> 00:58:42.257
Like, every big player in the market is releasing a new model this week, it seems. What a crazy timeline to be alive, and it becomes quite a scattered environment really; even with ChatGPT and OpenAI you have so many different models to pick from.

00:58:42.257 --> 00:58:52.693
I think we'll start to become more and more competent about which models are better for particular uses, and that's something you get through training and trying things out.

00:58:53.255 --> 00:59:07.516
One important takeaway of this episode, and truly the reason why I wanted this episode in the Fire Science Show, is that they've shared some concepts about how to set up your own LLM running on your local computer.

00:59:07.516 --> 00:59:58.635
It's not going to run on a very basic laptop that you just use to write, but if you have something akin to a gaming PC or a gaming laptop, chances are you will have a good enough graphics card to run an LLM. If you have that capability, I think it's quite interesting to try and set up your own instance of a large language model. With that instance you can play privately: you can upload your reports, your notes, your data, and see what the model can get out of them. By doing that you can start training, you can start improving, and eventually perhaps you can create something that actually helps you with your workflows.
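[Editor's note: for listeners who want to try the local setup described above, here is a minimal sketch of talking to a local model through Ollama's HTTP API. It assumes Ollama is installed and serving on its default port 11434; the model name is just an example.]

```python
import json
import urllib.request

# Ollama serves a local HTTP API on this port by default; nothing leaves your machine.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model: str, prompt: str) -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask_local_llm(model: str, prompt: str) -> str:
    """POST a prompt to the locally running Ollama server and return its answer."""
    body = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Usage (requires Ollama installed and `ollama pull gemma3:4b` run beforehand):
# answer = ask_local_llm("gemma3:4b", "Summarise the main smoke-control strategies.")
```

Because the server runs entirely on localhost, your prompts and documents never leave your computer.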

00:59:58.635 --> 01:00:01.469
This is something that has been going through my head for a long time.

01:00:01.469 --> 01:00:11.835
I had many discussions with many colleagues about how to do it, and usually we've stopped at the point of privacy, like you really cannot upload projects to the Internet.

01:00:11.835 --> 01:00:21.260
Another thing that I'm curious about is the capability of this AI to read technical drawings of the buildings.

01:00:21.260 --> 01:00:26.797
So I'm not sure what it will do with AutoCAD drawings.

01:00:26.797 --> 01:00:30.532
I'm not sure if it can take a Revit file and understand it.

01:00:30.614 --> 01:00:40.972
Perhaps a Revit integrated AI that could run on your own computer with like privacy protection is something that could make a big difference in the industry.

01:00:40.972 --> 01:00:46.733
I'm not sure if it's… well, okay, I'm pretty sure it's being developed by someone, because it's not the most innovative idea of the year.

01:00:48.514 --> 01:00:52.985
I'm pretty sure someone's working on that and I hope they succeed.

01:00:52.985 --> 01:01:00.166
If you're doing that, I hope you succeed and we get some interesting AI support tools for our engineering workflows.

01:01:00.166 --> 01:01:04.373
That would be it for today's episode of the Fire Science Show.

01:01:04.934 --> 01:01:08.320
A link to Amir's paper is in the show notes.

01:01:08.320 --> 01:01:10.467
An Ollama link is in the show notes.

01:01:10.467 --> 01:01:27.536
You can use that to start and see where you get on your own large language model journey, on your own local instance, with good privacy and without sending stuff to Sam Altman, which you probably do not want to do.

01:01:27.536 --> 01:01:44.385
Thanks for being here with me in this human-arranged, human-recorded, non-AI-created podcast, delivering imperfect human content to you every Wednesday.

01:01:44.385 --> 01:01:50.726
Yours truly, Wojciech. Cheers, see you here next Wednesday.

01:01:50.726 --> 01:01:51.427
Another Fire Science episode.

01:01:51.427 --> 01:01:51.907
Bye-bye, bye.