diff --git a/README.md b/README.md
index 7233a6f..a7634d7 100644
--- a/README.md
+++ b/README.md
@@ -405,6 +405,47 @@ In this methodology, although the model is still guided by instructions (e.g., "
+
+#### Steps for Inference
+
+ **Prepare the Model for Inference**:
+ The fine-tuned model is switched into generation mode with `FastLanguageModel.for_inference`, which enables Unsloth's faster native inference path.
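+ A minimal sketch of this step, assuming a LoRA checkpoint saved locally as `lora_model` (the path, `max_seq_length`, and 4-bit loading are illustrative choices, not fixed requirements):
+
+ ```python
+ from unsloth import FastLanguageModel
+
+ # Load the fine-tuned model; name and settings are illustrative
+ model, tokenizer = FastLanguageModel.from_pretrained(
+     model_name="lora_model",   # hypothetical local checkpoint directory
+     max_seq_length=2048,
+     load_in_4bit=True,
+ )
+ FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference mode
+ ```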
+
+ **Define User Inputs**:
+ Input messages are defined explicitly to avoid unnecessary system messages.
+ **Example Input**:
+ `"ಪರಿಸರದ ಬಗ್ಗೆ ಬರೆಯಿರಿ ಮತ್ತು ಪ್ರಬಂಧವನ್ನು ಬರೆಯಿರಿ."` (Write an essay about the environment.)
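+ Using the example prompt above, the input can be written as a single user turn with no system message:
+
+ ```python
+ messages = [
+     # One user turn; no system message is prepended
+     {"role": "user", "content": "ಪರಿಸರದ ಬಗ್ಗೆ ಬರೆಯಿರಿ ಮತ್ತು ಪ್ರಬಂಧವನ್ನು ಬರೆಯಿರಿ."},
+ ]
+ ```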
+
+ **Tokenization and Formatting**:
+ The messages are formatted with the model's chat template and tokenized via `tokenizer.apply_chat_template`, using the following options:
+ - **`tokenize=True`**: Converts the formatted text into token IDs.
+ - **`add_generation_prompt=True`**: Appends the assistant prompt marker so generation starts from the assistant's turn.
+ - **`return_tensors="pt"`**: Returns PyTorch tensors, as expected by `model.generate`.
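+ The options above combine into a single call; moving the tensors to `"cuda"` is an assumption for GPU inference:
+
+ ```python
+ input_ids = tokenizer.apply_chat_template(
+     messages,
+     tokenize=True,
+     add_generation_prompt=True,  # generation begins from the assistant's turn
+     return_tensors="pt",
+ ).to("cuda")
+ ```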
+
+ **Generating Responses**:
+ The fine-tuned model generates a response with:
+ - **`max_new_tokens=1024`**: Caps the number of newly generated tokens (the prompt is not counted against this limit).
+ - **`temperature=1.5`**: Raises sampling randomness for more varied, creative output.
+ - **`min_p=0.1`**: Discards tokens whose probability falls below 10% of the most likely token's, trimming the low-quality tail that a high temperature would otherwise sample from.
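+ A corresponding generation call might look like this (`input_ids` is the tokenized prompt from the previous step):
+
+ ```python
+ outputs = model.generate(
+     input_ids=input_ids,
+     max_new_tokens=1024,   # cap on newly generated tokens
+     temperature=1.5,       # high randomness for creative text
+     min_p=0.1,             # drop tokens below 10% of the top token's probability
+     use_cache=True,        # reuse the KV cache for faster decoding
+ )
+ ```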
+
+ **Decoding and Post-Processing**:
+ The output token IDs are decoded back to text, and any chat-template markers or echoed prompt text are stripped so that only the assistant's response remains, concise and focused.
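+ One possible sketch of this cleanup; the `"assistant"` separator used to split off the reply is an assumption that depends on the model's chat template:
+
+ ```python
+ decoded = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
+ # Keep only the assistant's reply; the "assistant" marker below is a
+ # template-specific assumption and may differ for other models.
+ response = decoded.split("assistant")[-1].strip()
+ ```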
+
+ **Output Example**:
+ The generated response aligns with the instruction, such as providing a detailed essay on the environment in the Kannada language.
+
+
### Saving the Model & Tokenizer