@google-rayazsiddiqi, we are just getting all h2 headings with the class "listing-company" and then looping over them and printing each one.
h2 is a level-2 heading in HTML, and classes are usually used to identify and select specific tags and objects in a web page.
Does this make sense?
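To make the selection step concrete, here is a minimal sketch that runs against a small hard-coded HTML snippet instead of the live page. The markup and job names are made up for illustration; only the h2 tag and the listing-company class come from the discussion above.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two headings carry the class, one does not.
html = """
<h2 class="listing-company"><a href="/jobs/1/">Backend Developer</a></h2>
<h2 class="other-heading">Not a job listing</h2>
<h2 class="listing-company"><a href="/jobs/2/">Data Engineer</a></h2>
"""

soup = BeautifulSoup(html, "html.parser")

# Select only the <h2> tags whose class attribute includes "listing-company".
headings = soup.find_all("h2", class_="listing-company")

# The job title is the text of the <a> link nested inside each matching heading.
titles = [heading.a.text for heading in headings]

for title in titles:
    print(title)
```

The class names the heading, not the title itself. The title is simply the link text that sits inside that heading, which is why it is reached through the h2.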
@google-rayazsiddiqi I plan to create an example for you. I need time to do that.
Regards,
Earnie Boyd, CEO
Seasoned Solutions Advisor LLC
Schedule 1-on-1 help
Join me on Slack
I am struggling to understand how the job title comes under listing company!!
@google-rayazsiddiqi I've modified the code in the Solution of the training page to help explain what is happening. I hope this clears up your confusion.
import requests
from bs4 import BeautifulSoup

# Send a GET request to the Python job board
response = requests.get('https://www.python.org/jobs/')

# Parse the content of the response with BeautifulSoup
soup = BeautifulSoup(response.content, 'html.parser')

# Find all the job posts
job_posts = soup.find_all('h2', class_='listing-company')

# Print the title of each job post
# Loop through all of the job_posts
for job_post in job_posts:
    # job_post contains all the elements of the <h2> tag
    print(f"JOB_POST: {job_post}\n")
    # job_post.a is the <a href="..."> tag
    print(f"JOB_POST.A: {job_post.a}\n")
    # job_post.a.text is the text portion of the link.
    title = job_post.a.text
    print(f"JOB_POST.A.TEXT: {title}\n")
@ssadvisor I think I need 121 help on this, I'll schedule something in your diary!!
@ssadvisor I think this is a lot clearer, couple of things:
What do the f" and \n" in this code do: print(f"JOB_POST: {job_post}\n")
Also, where do you define the variable job_post?
Thanks
@google-rayazsiddiqi the f in front of the quotes makes the string an f-string (formatted string), so Python replaces {job_post} with the value of the job_post variable when it prints.
The \n starts a new line.
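A small sketch of both pieces, which also covers the question about where job_post is defined: the for statement itself creates that variable and assigns it the next item from job_posts on each pass. The list of strings here is a made-up stand-in for the real search results.

```python
# Stand-in data (hypothetical); in the scraper this list comes from soup.find_all(...).
job_posts = ["Backend Developer", "Data Engineer"]

lines = []
# The for statement defines job_post: on each pass it holds the next item of job_posts.
for job_post in job_posts:
    # The f prefix makes this an f-string, so {job_post} is replaced by the variable's value.
    # \n is a newline character; print() adds its own newline as well,
    # which leaves a blank line between entries.
    line = f"JOB_POST: {job_post}\n"
    lines.append(line)
    print(line)

# Without the f prefix, the braces are printed literally.
plain = "JOB_POST: {job_post}"
print(plain)
```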
@husein Thanks, I think this makes a lot more sense now. I'll have a play around with another website to see if I can do it myself!
I tried this on another website, see HTML below:
This is my code:
And this is the output. It printed all the text; how do I modify it to print just the job title?
JOB_POST.TEXT: The Thames Valley
JOB_POST.TEXT: We have two Trustee vacancies we wish to fill, to complement our existing Board and guide our successful and growing organisation into the next phase of development.
JOB_POST.TEXT: June 30, 2024
JOB_POST.TEXT: Voluntary
JOB_POST.TEXT: Full Time
JOB_POST.TEXT: Clinks
82A James Carter Road
Mildenhall
Suffolk
IP28 7DE
020 4502 6774
[email protected]
Clinks is a registered charity no. 1074546 and a company limited by guarantee, registered in England no. 3562176
JOB_POST.TEXT:
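The HTML of that page is not shown here, so the following is only a sketch under assumptions. Searching for every div with the class field-item matches each field on the page (location, description, date and so on), which would explain the output above. The usual fix is to search for the element that wraps only the title. The markup, the h1 tag and the job-title class below are all hypothetical; inspect the real page to find the right tag and class.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real page's HTML is not shown in the thread.
html = """
<h1 class="job-title">Trustee</h1>
<div class="field-item">The Thames Valley</div>
<div class="field-item">June 30, 2024</div>
<div class="field-item">Voluntary</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Matching on the shared class returns every field, not just the title.
all_fields = [div.text for div in soup.find_all("div", class_="field-item")]

# Targeting the element that wraps only the title narrows the result.
title_tag = soup.find("h1", class_="job-title")  # assumed tag and class name
title = title_tag.text.strip() if title_tag else None

print(f"JOB TITLE: {title}")
```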
@google-rayazsiddiqi you have an extra character in your code; it should be "class" instead of "class_".
@ssadvisor I changed it and I got a syntax error...strange!
PS C:\Users\rayaz> & D:/Python/python.exe "c:/Users/rayaz/test file.py"
  File "c:\Users\rayaz\test file.py", line 11
    job_posts = soup.find_all('div', class='field-item')
                                     ^^^^^
SyntaxError: invalid syntax
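The syntax error is expected: class is a reserved word in Python, so it cannot be used as a keyword argument name. That is why BeautifulSoup spells the argument class_ with a trailing underscore, and why class_='field-item' is the valid form. The sketch below checks this with the standard library only; the find_all calls are compiled as text rather than run, so no page or soup object is needed.

```python
import keyword

# "class" is reserved by Python; "class_" is an ordinary identifier.
class_is_reserved = keyword.iskeyword("class")
class_underscore_is_reserved = keyword.iskeyword("class_")


def compiles(source: str) -> bool:
    """Return True if the source text is syntactically valid Python."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# The line from the traceback fails to parse; the class_ spelling parses fine.
bad = "job_posts = soup.find_all('div', class='field-item')"
good = "job_posts = soup.find_all('div', class_='field-item')"
# Passing the class through an attrs dictionary is another valid spelling.
alt = "job_posts = soup.find_all('div', attrs={'class': 'field-item'})"

print(class_is_reserved, class_underscore_is_reserved)
print(compiles(bad), compiles(good), compiles(alt))
```

So the fix for the error above is to go back to class_ and change which tag or class is being searched for instead.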