token_limit.py
Hi Hasan,
Why do you put the following token_limit.py code block in parentheses?
# Split transcript into chunks if needed
chunks = (
    helpers.split_text_into_chunks(video_transcript, helpers.max_tokens_per_chunk)
    if helpers.count_tokens(video_transcript, selected_model)
    > helpers.max_tokens_per_chunk
    else [video_transcript]
)
Thank you in advance.
This code block is using a Python feature called the ternary (or conditional) operator. The ternary operator is a shorthand way of writing an if-else
statement. It allows you to evaluate a condition and return one value if the condition is true, and another value if the condition is false. Here's a breakdown:
- helpers.count_tokens(video_transcript, selected_model) > helpers.max_tokens_per_chunk is the condition.
- helpers.split_text_into_chunks(video_transcript, helpers.max_tokens_per_chunk) is the value that will be assigned to chunks if the condition is true.
- [video_transcript] is the value that will be assigned to chunks if the condition is false.
The code block is wrapped in parentheses so that the expression can span several lines. Inside parentheses, Python treats the line breaks as part of one statement (implicit line continuation). Without them, the statement would end at the first line break and the code would raise a SyntaxError, unless everything were written on one line or joined with backslashes. The parentheses also make it clear to anyone reading the code that this is a single logical operation, rather than multiple separate statements.
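To make the point above concrete, here is a small sketch that puts a parenthesized conditional expression next to the equivalent if/else statement. The function names and the character-based splitting are made up for illustration; they are not the helpers from the course code.

```python
def pick_chunks(text, limit):
    # Conditional expression wrapped in parentheses so it can span lines.
    # Hypothetical example: splits by characters, not by tokens.
    chunks = (
        [text[i:i + limit] for i in range(0, len(text), limit)]
        if len(text) > limit
        else [text]
    )
    return chunks


def pick_chunks_if_else(text, limit):
    # The same decision written as an ordinary if/else statement.
    if len(text) > limit:
        chunks = [text[i:i + limit] for i in range(0, len(text), limit)]
    else:
        chunks = [text]
    return chunks
```

Both functions return the same list for the same input; the first form is just more compact.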
@admin Hello sir, I am facing this issue. How can I solve it in token_limit?
Does the helper file contain max_tokens_per_chunk?
@admin sir, I have this code in the helpers file:
import openai
import tiktoken
import newspaper
import youtube_transcript_api
import re
from youtube_transcript_api import YouTubeTranscriptApi

# replace with your api key
openai.api_key = "YOUR_API_KEY"


def estimate_input_cost(model_name, token_count):
    if model_name == "gpt-3.5-turbo-0613":
        cost_per_1000_tokens = 0.0015
    elif model_name == "gpt-3.5-turbo-16k-0613":
        cost_per_1000_tokens = 0.003
    elif model_name == "gpt-4-0613":
        cost_per_1000_tokens = 0.03
    elif model_name == "gpt-4-32k-0613":
        cost_per_1000_tokens = 0.06
    else:
        # Without this branch an unknown model would leave the price undefined
        raise ValueError("Unknown model: " + model_name)
    estimated_cost = (token_count / 1000) * cost_per_1000_tokens
    return estimated_cost


def count_tokens(text, selected_model):
    encoding = tiktoken.encoding_for_model(selected_model)
    num_tokens = encoding.encode(text)
    return len(num_tokens)


def generate_text_with_openai(user_prompt):
    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",  # you can replace this with your preferred model
        messages=[{"role": "user", "content": user_prompt}],
    )
    return completion.choices[0].message.content


def get_article_from_url(url):
    try:
        # Scrape the web page for content using newspaper
        article = newspaper.Article(url)
        # Download the article's content
        article.download()
        # Check if the download was successful before parsing the article
        # (2 is newspaper's ArticleDownloadState.SUCCESS)
        if article.download_state == 2:
            article.parse()
            # Get the main text content of the article
            article_text = article.text
            return article_text
        else:
            print("Error: Unable to download article from URL:", url)
            return None
    except Exception as e:
        print("An error occurred while processing the URL:", url)
        print(str(e))
        return None


def get_video_transcript(video_url):
    match = re.search(r"(?:youtube\.com\/watch\?v=|youtu\.be\/)(.*)", video_url)
    if match:
        VIDEO_ID = match.group(1)
    else:
        raise ValueError("Invalid YouTube URL")
    video_id = VIDEO_ID
    # Fetch the transcript using the YouTubeTranscriptApi
    transcript = YouTubeTranscriptApi.get_transcript(video_id)
    # Extract the text of the transcript
    transcript_text = ""
    for line in transcript:
        transcript_text += line["text"] + " "
    return transcript_text
Now please can you tell me what to do here?
@kunal-lonhare where is "max_tokens_per_chunk"? It is not in the code you pasted.
You can use the code block here to paste source code next time,
and make sure you are not sharing your real API key.
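For anyone else whose helpers file is missing these two names: token_limit.py reads helpers.max_tokens_per_chunk and calls helpers.split_text_into_chunks, so both have to exist in the helpers file. Below is a minimal sketch of what could be added. The value 1000 is the one reported later in this thread. The function body is only an assumption, not the original course code: it treats whitespace-separated words as a rough stand-in for tokens, so real token counts from tiktoken will differ.

```python
# Value reported later in this thread; adjust it to your model's limits.
max_tokens_per_chunk = 1000


def split_text_into_chunks(text, max_tokens):
    """Split text into chunks of at most max_tokens words.

    Assumption: words are used as an approximation of tokens.
    This is a hypothetical stand-in, not the original helper.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    words = text.split()
    chunks = []
    # Walk through the words in steps of max_tokens and join each slice
    for start in range(0, len(words), max_tokens):
        chunks.append(" ".join(words[start:start + max_tokens]))
    return chunks
```

With this in place, helpers.split_text_into_chunks(video_transcript, helpers.max_tokens_per_chunk) returns a list of strings, which matches how token_limit.py uses the result.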
Hello Hassan, or others in the know! I have a problem with "token_limit.py" where I don't know what to do.
Here is the error message:
Traceback (most recent call last):
  File "c:\Users\gpiuk\Documents\GitHub\prompts\token_limit.py", line 31, in <module>
    print(overall_output)
  File "C:\Users\gpiuk\AppData\Local\Programs\Python\Python311\Lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'charmap' codec can't encode character '\U0001f4a1' in position 2: character maps to <undefined>
It would be great if you could help me.
Greetings from Austria
Gerald
@gerald-piuk can you please screenshot your code, or share it here, so we can check it.
Hi Hassan,
THANK YOU for your quick offer of help!
For some reason this is working now, the bad thing is that I don't know why but it is working!
THANK YOU again and you are doing great!
Best regards from the alpine republic!
Gerald
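A possible explanation for the error above, not confirmed in this thread: the traceback shows print() writing through the cp1252 codec, and cp1252 has no mapping for the character '\U0001f4a1' (a light bulb emoji) that the model put in its answer. If the next answer contains no such character, the same script runs fine, which would fit the "working now" report. The sketch below shows two common workarounds. The names safe_print and use_utf8_stdout are made up for illustration.

```python
import io
import sys


def use_utf8_stdout():
    # Option 1: switch the console stream to UTF-8 (Python 3.7+).
    # Guarded because sys.stdout is not always a reconfigurable text stream.
    if hasattr(sys.stdout, "reconfigure"):
        sys.stdout.reconfigure(encoding="utf-8", errors="replace")


def safe_print(text, stream=None):
    # Option 2: replace characters the stream's encoding cannot represent.
    if stream is None:
        stream = sys.stdout
    encoding = getattr(stream, "encoding", None) or "utf-8"
    # Round-trip through the target encoding; unencodable characters become "?"
    printable = text.encode(encoding, errors="replace").decode(encoding)
    print(printable, file=stream)


# Demo: simulate a cp1252 console like the one in the traceback
fake_console = io.TextIOWrapper(io.BytesIO(), encoding="cp1252", newline="\n")
safe_print("\U0001f4a1 idea", stream=fake_console)
fake_console.flush()
demo_bytes = fake_console.buffer.getvalue()  # b"? idea\n"
```

In token_limit.py this would mean either calling use_utf8_stdout() once at the top of the script or replacing print(overall_output) with safe_print(overall_output).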
Never mind, it was there and I overlooked it. It says max_tokens_per_chunk = 1000
Posted by: @shelaheramis
@admin it's not in the helper file