Data Science from Zero to Hero: Python for Data Science Session1
- Jitendra Singh
- Jan 26
- 8 min read
To start learning Python, make sure you have installed Anaconda / Jupyter Notebook on your system. Click here if you need help with that. We will start with a few basic Python concepts. Don't worry, we will do some practical exercises to understand them. You are welcome to practice everything offered at https://www.w3schools.com/python/ but the topics shared here are a great place to start your data science journey:
• Python Variable types
• Reassign variable
• Python data types
• Create numeric & string variables
• Checking data types with type() function
• Data type conversion
• Type casting
• Functions in python
• Commenting code in python
• Python Operators
• Runtime variable
• Data slicing
• String Handling
• Python Flow Control (Conditional Statements, Loops, Function Calls )
• Pseudocode
• Python Objects (Container or collection objects)
• List operations
• Tuple Operations
• Dictionary operations
• Built-in dictionary functions
• Loop through the dictionary object
• Set operations
Python Variable Types
Variables are names bound to objects in Python. Unlike statically typed languages, Python is dynamically typed: a variable name only refers to an object, and the object carries the type information.
Variables are references to objects in memory. Types are tied to values (objects), not to variable names. You can inspect the identity using id() and the type using type(). Understanding that variables are references helps explain behavior when mutating objects (e.g., lists) vs. reassigning names.
# Examples: variable names referencing objects
a = 10  # 'a' references an int object with value 10
b = a  # 'b' now references the same int object
print(id(a), id(b))  # same id, because both names refer to the same object
a = a + 1  # rebind 'a' to a new int object (ints are immutable)
print(a, b)  # 11 10
# Mutable object example
lst1 = [1, 2, 3]
lst2 = lst1  # both names reference the same list
lst2.append(4)
print(lst1)  # prints [1, 2, 3, 4] because the list is mutable and shared
Reassign Variable
Reassignment changes what object a variable name refers to. Because Python variables are labels for objects, reassigning doesn't mutate the original object (unless the object itself is mutated).
Because of dynamic typing, you can assign values of different types to the same variable name at different times. While flexible, excessive reassigning to different types can make code harder to read and error-prone—prefer consistent use within a given scope.
x = 42  # integer
print(type(x), x)
x = "now a string"  # reassign to string
print(type(x), x)
# Reassigning complex objects
config = {"mode": "dev"}
config = ["dev", "staging", "prod"]  # same name now refers to a list
Python Data Types
Python provides several built-in data types: numeric (int, float, complex), sequence (str, list, tuple, range), mapping (dict), set, and boolean. There are also user-defined classes and many specialized types in the standard library.
Important differences: immutability vs mutability (e.g., tuples and strings are immutable; lists and dicts are mutable). Immutability provides safety in hashed collections and easier reasoning about state, while mutability is more memory-efficient for large, changing datasets.
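To make the mutable/immutable distinction concrete, here is a minimal sketch. The variable names are illustrative only. It shows that item assignment on a string fails while a list changes in place, and that only hashable objects can be used as dictionary keys.

```python
# Strings are immutable: item assignment raises TypeError
word = "data"
try:
    word[0] = "D"
    string_changed = True
except TypeError:
    string_changed = False

# Lists are mutable: the same object is changed in place
values = [1, 2, 3]
values[0] = 99

# Hashable tuples work as dict keys; lists do not
lookup = {(1, 2): "tuple key"}
try:
    lookup[[1, 2]] = "list key"
    list_key_ok = True
except TypeError:
    list_key_ok = False

print(string_changed, values, list_key_ok)  # False [99, 2, 3] False
```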
# Numeric types
i = 10
f = 3.14
c = 1 + 2j
# Sequences
s = "hello"
lst = [1, 2, 3]
t = (1, 2, 3)
# Mapping and set
d = {"a": 1}
st = {1, 2, 3}
Create Numeric & String Variables
Creating numeric and string variables is straightforward. Python automatically assigns the appropriate object type.
Be mindful of numeric literal formats (binary 0b, octal 0o, hex 0x), underscores in numerics for readability, and raw strings for backslashes in paths or regex.
# Numeric literals and underscores
million = 1_000_000
binary_val = 0b1010
hex_val = 0xFF
# Strings and raw strings
normal = "Hello\nWorld"
raw = r"Hello\nWorld"  # backslash not interpreted
Checking Data Types with type() Function
Use type(obj) to get the class/type of an object. For type checking, prefer duck typing or isinstance(obj, Class) rather than comparing types directly; isinstance supports inheritance and is more flexible.
Note: type() comparison (type(x) is int) fails with subclasses; isinstance(x, numbers.Real) supports abstract base classes.
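The subclass point is easy to see with bool, which is a subclass of int. The following is a small sketch with illustrative names, contrasting an exact type comparison with isinstance.

```python
from numbers import Real

flag = True  # bool is a subclass of int

exact_match = type(flag) is int         # False: the exact type is bool
subclass_match = isinstance(flag, int)  # True: isinstance follows inheritance
abc_match = isinstance(2.5, Real)       # True: float is registered as Real

print(exact_match, subclass_match, abc_match)  # False True True
```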
from numbers import Real
x = 3
print(type(x))  # <class 'int'>
print(isinstance(x, int))  # True
print(isinstance(x, Real))  # True for int and float
Data Type Conversion / Type Casting
Python provides explicit conversion functions: int(), float(), str(), bool(), list(), tuple(), set(), dict(). Conversion may fail (ValueError) if the input string doesn’t represent the target type.
Implicit conversions also occur in numeric expressions (e.g., int + float -> float). Be deliberate with conversions: converting a float to int truncates toward zero, and converting a non-numeric string raises a ValueError.
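A short sketch of the implicit conversion and truncation behavior described above. The names are illustrative.

```python
total = 2 + 0.5            # int + float -> float
print(type(total), total)  # <class 'float'> 2.5

# int() truncates toward zero, it does not round or floor
positive = int(3.9)   # 3
negative = int(-3.9)  # -3 (math.floor would give -4)

# bool() treats zero and empty containers as False
empty_is_false = bool([]) is False and bool(0) is False
print(positive, negative, empty_is_false)  # 3 -3 True
```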
# Examples of conversions
s = '123'
n = int(s)
f = float(s)
print(n + 10, f * 1.5)
# Converting containers
t = (1, 2, 3)
print(list(t))  # [1, 2, 3]
# Edge case: invalid conversion
try:
    int("12.3")  # ValueError: invalid literal
except ValueError as e:
    print("Can't convert:", e)
Functions in Python
Functions encapsulate reusable logic. Define with def or lambda (anonymous). Functions are objects and can be passed around, returned, and stored. Type hints (PEP 484) are optional but useful for documentation and tooling.
Key concepts: positional vs keyword arguments, default arguments, *args/**kwargs for variable arguments, closures, and decorators for higher-order behavior.
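Closures and decorators are easier to grasp with a small example. The sketch below uses made-up names (make_counter, log_calls, multiply). It shows a closure that remembers state between calls and a decorator that records every call to the wrapped function.

```python
import functools

def make_counter():
    """Closure: the inner function remembers 'count' between calls."""
    count = 0
    def increment():
        nonlocal count
        count += 1
        return count
    return increment

def log_calls(func):
    """Decorator: wraps func and records the arguments of each call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls.append((args, kwargs))
        return func(*args, **kwargs)
    wrapper.calls = []
    return wrapper

@log_calls
def multiply(a, b=2):
    return a * b

counter = make_counter()
print(counter(), counter())           # 1 2
print(multiply(3), multiply(3, b=4))  # 6 12
print(multiply.calls)                 # [((3,), {}), ((3,), {'b': 4})]
```

functools.wraps keeps the original function's name and docstring on the wrapper, which helps when debugging decorated functions.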
# Basic function with return and type hints
def add(a: int, b: int) -> int:
    """Return sum of a and b."""
    return a + b
print(add(2, 3))
# Args and kwargs, default values
def greet(name, *titles, sep=' ', **attrs):
    print(f"Hello {sep.join((name,) + titles)}; attrs={attrs}")
greet("Dr. Alice", "PhD", sep=", ", role="instructor")
Commenting Code in Python
Use # for single-line comments. Triple-quoted strings are often used as docstrings for modules, classes, and functions (accessible via __doc__). Comments should explain why — not what — when the code is non-obvious.
Docstrings follow conventions (PEP 257) and are used by tools like Sphinx. Keep comments up-to-date; stale comments are worse than none.
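As a quick illustration of how a docstring differs from a comment, here is a sketch with a hypothetical area function. The docstring is stored on the function object and can be read back at runtime, while the comment is simply ignored by the interpreter.

```python
def area(width, height):
    """Return the area of a rectangle with the given width and height."""
    # Plain multiplication is enough here; no unit conversion is attempted.
    return width * height

print(area.__doc__)  # the docstring text
print(area(2, 3))    # 6
```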
# Single-line comment explaining intent
x = x + 1  # increment counter
def func():
    """Short docstring: describe function purpose, args, and return."""
    pass
Python Operators
Operators include arithmetic (+, -, *, /, //, %, **), comparison (==, !=, <, >, <=, >=), logical (and, or, not), assignment (=, +=), identity (is, is not), and membership (in, not in).
Operator precedence matters; use parentheses to clarify. The 'is' operator checks object identity, not equality; use '==' for value equality.
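The difference between identity and equality shows up clearly with lists. The sketch below (illustrative names) also includes two precedence cases that often surprise beginners.

```python
first = [1, 2, 3]
second = [1, 2, 3]
alias = first

print(first == second)  # True: same contents
print(first is second)  # False: two separate list objects
print(first is alias)   # True: both names refer to one object

# Precedence: * binds tighter than +, and ** binds tighter than unary minus
no_parens = 2 + 3 * 4      # 14
with_parens = (2 + 3) * 4  # 20
negated_power = -2 ** 2    # -4, parsed as -(2 ** 2)
print(no_parens, with_parens, negated_power)
```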
# Examples
a, b = 7, 3
print(a // b)  # floor division = 2
print(a ** b)  # exponentiation = 343
print(a == 7)  # True
print(a is 7)  # identity check: True in CPython only because small ints are cached, and Python 3.8+ emits a SyntaxWarning here; use == for values
Runtime Variables (Input, Environment, CLI)
Runtime variables are values provided while a program runs: user input (input()), environment variables (os.environ), command-line arguments (sys.argv or argparse).
Always validate and sanitize runtime input; it can be malformed, missing, or malicious. For CLI programs, prefer argparse for robust argument parsing and help messages.
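For command-line arguments, a minimal argparse sketch might look like this. The option names --name and --age are invented for the example. Passing a list to parse_args() avoids reading sys.argv, which makes the snippet safe to run inside a notebook.

```python
import argparse

def build_parser():
    # '--name' and '--age' are hypothetical options for this sketch
    parser = argparse.ArgumentParser(description="Greet a user.")
    parser.add_argument("--name", default="guest", help="name to greet")
    parser.add_argument("--age", type=int, default=0, help="age in years")
    return parser

# An explicit argument list stands in for what a user would type on the command line
args = build_parser().parse_args(["--name", "Asha", "--age", "31"])
print(args.name, args.age)  # Asha 31
```

Because --age is declared with type=int, argparse validates it and exits with a usage message if a non-numeric value is given.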
# Reading user input
name = input('Enter name: ')
age_s = input('Enter age: ')
try:
    age = int(age_s)
except ValueError:
    print('Invalid age')
# Reading environment variables
import os
db_url = os.getenv('DATABASE_URL', 'sqlite:///local.db')
Data Slicing (Strings, Lists, Ranges)
Slicing extracts subsequences: seq[start:stop:step]. Start is inclusive, stop is exclusive. Omitting indices uses defaults (start=0, stop=len). Negative indices count from end.
Slicing returns a new object for sequences like list and str. For large slices, avoid copying unnecessarily (use iterators) if performance/memory matters.
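The sketch below illustrates both points: a list slice is an independent copy, and itertools.islice can take a slice-like selection from an iterable lazily, without copying it first. The names are illustrative.

```python
from itertools import islice

numbers = list(range(10))
first_three = numbers[:3]  # a new list; 'numbers' is untouched
first_three[0] = 100
print(numbers[0], first_three)  # 0 [100, 1, 2]

# islice(iterable, start, stop, step) yields items lazily
lazy_slice = islice(range(1_000_000), 2, 10, 3)
picked = list(lazy_slice)
print(picked)  # [2, 5, 8]
```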
s = 'PythonProgramming'
print(s[0:6])  # 'Python'
print(s[6:])  # 'Programming'
print(s[-11:-1])  # using negative indices
print(s[::2])  # every second character
lst = [0, 1, 2, 3, 4, 5]
print(lst[1:4])  # [1, 2, 3]
print(lst[::-1])  # reversed copy
String Handling
Strings are immutable sequences of Unicode code points. Common operations include concatenation, formatting (f-strings, format(), %), searching, splitting, and regular expressions. For heavy string building, use list join pattern to avoid quadratic behavior.
Be mindful of encoding when working with bytes and I/O: convert between str and bytes using encode/decode with a known encoding (utf-8 preferred).
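Splitting, searching, and regular expressions from the list above can be sketched as follows. The sample line and its field names are made up for the example.

```python
import re

line = "id=42; name=Asha; city=Pune"

# Splitting and stripping whitespace around each piece
fields = [part.strip() for part in line.split(";")]

# Searching
position = line.find("name")  # index of the first match, -1 if absent
has_city = "city" in line

# Regular expression: capture key=value pairs
pairs = dict(re.findall(r"(\w+)=(\w+)", line))

print(fields)    # ['id=42', 'name=Asha', 'city=Pune']
print(position)  # 7
print(has_city)  # True
print(pairs)     # {'id': '42', 'name': 'Asha', 'city': 'Pune'}
```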
# f-strings and formatting
name = 'Alice'
g = f'Hello, {name.upper()}'
print(g)
# Efficient building
parts = []
for i in range(5):
    parts.append(str(i))
result = ','.join(parts)
print(result)
# Bytes and encoding
text = 'café'
b = text.encode('utf-8')
print(b)
print(b.decode('utf-8'))
Python Flow Control (Conditionals, Loops, Function Calls)
Control flow directs execution through conditionals (if/elif/else) and loops (for/while). For loops iterate over iterables; while loops use a condition. Use break/continue to control loop flow.
Prefer 'for item in iterable' over index-based loops when possible. Use generator expressions and iterators for memory-efficient pipelines. Function calls encapsulate behavior and should be kept small (single responsibility).
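Here is a small sketch of a generator expression together with break and continue. The readings list is invented sample data.

```python
readings = [4, -1, 7, 0, 12, -5, 9]

# Generator expression: values are produced one at a time, no intermediate list
total_positive = sum(r for r in readings if r > 0)

# continue skips an iteration; break leaves the loop early
collected = []
for r in readings:
    if r < 0:
        continue  # ignore negative readings
    if r > 10:
        break     # stop at the first reading above 10
    collected.append(r)

print(total_positive, collected)  # 32 [4, 7, 0]
```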
# Conditional
x = 10
if x < 0:
    print('negative')
elif x == 0:
    print('zero')
else:
    print('positive')
# Loop and function call
def is_prime(n):
    if n < 2:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True
for num in range(2, 20):
    print(num, is_prime(num))
Pseudocode
Pseudocode is a language-agnostic way to express algorithms before coding. It helps plan edge cases, data structures, and complexity.
Good pseudocode focuses on clarity: describe intent with simple constructs. Once validated, translate into Python, keeping variable names consistent with the pseudocode.
# Example pseudocode for computing factorial:
# 1. Read n
# 2. result = 1
# 3. For i from 2 to n:
#        result = result * i
# 4. Print result
# Translated to Python:
def factorial(n):
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result
Python Objects (Container or Collection Objects)
Collection objects aggregate multiple values: lists, tuples, sets, dicts. Choose based on needs: ordered, mutable sequence (list), fixed collection (tuple), unique elements (set), key-value mapping (dict).
Understand time complexities: list append O(1) amortized, dict/set average O(1) for lookup/insert, list membership O(n). These influence data structure choice in algorithms.
# Examples
lst = [1, 2, 3]
tup = (1, 2, 3)
st = {1, 2, 3}
d = {'a': 1, 'b': 2}
List Operations
Lists are mutable ordered sequences. Common operations: append, extend, insert, pop, remove, clear, index, count, sort, reverse. Use list comprehensions for concise transformations.
Be cautious: list multiplication with nested lists can create shared references. For large data, prefer generator expressions to avoid memory spikes.
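The shared-reference pitfall with list multiplication looks like this in practice. This is a minimal sketch with illustrative names.

```python
shared = [[0] * 3] * 2  # two references to the SAME inner list
shared[0][0] = 1
print(shared)           # [[1, 0, 0], [1, 0, 0]]

independent = [[0] * 3 for _ in range(2)]  # a fresh inner list per row
independent[0][0] = 1
print(independent)      # [[1, 0, 0], [0, 0, 0]]
```

Multiplying the inner list ([0] * 3) is safe because integers are immutable; the problem only appears when the repeated element is itself mutable.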
# Common list operations
fruits = ['apple', 'banana']
fruits.append('cherry')
fruits.extend(['date', 'elderberry'])
fruits.insert(1, 'blueberry')
print(fruits)
print(fruits.pop())  # remove last
print(fruits.remove('banana'))  # returns None, mutates list
# List comprehensions vs generators
squares = [i*i for i in range(10)]
gen = (i*i for i in range(10))
Tuple Operations
Tuples are immutable ordered sequences. They are useful as keys in dictionaries or to represent fixed records. You can unpack tuples into variables and use them as return values from functions.
Though immutable, a tuple can contain mutable objects (like lists), which can be mutated in place—only the tuple's structure is immutable.
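Two of the uses mentioned above, tuples as return values and as dictionary keys, can be sketched like this. The function min_max and the cell_labels dictionary are hypothetical names for illustration.

```python
def min_max(values):
    """Return the smallest and largest value as a tuple."""
    return min(values), max(values)

low, high = min_max([3, 9, 1, 7])  # unpack the returned tuple
print(low, high)                   # 1 9

# Tuples of immutables are hashable, so they can be dictionary keys
cell_labels = {(0, 0): "origin", (1, 2): "target"}
print(cell_labels[(1, 2)])         # target
```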
# Tuple examples
record = ('Alice', 30, 'Engineer')
name, age, role = record  # unpacking
print(name, age, role)
# Tuple with mutable item
t = (1, [2, 3], 4)
t[1].append(5)
print(t)
Dictionary Operations
Dictionaries map hashable keys to values. Keys must be immutable and hashable (strings, numbers, tuples of immutables). Common operations: access, setdefault, get, update, pop, popitem, clear.
For ordered behavior, since Python 3.7 insertion order is preserved. For specialized mappings, use collections.OrderedDict (older versions), defaultdict, or ChainMap.
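As a rough illustration of the specialized mappings mentioned above, the sketch below counts words with defaultdict and layers settings with ChainMap. The sample data and key names are invented.

```python
from collections import ChainMap, defaultdict

# defaultdict creates a default value the first time a key is seen
counts = defaultdict(int)
for word in ["a", "b", "a", "c", "a"]:
    counts[word] += 1

# ChainMap looks keys up in several dicts; the first match wins
defaults = {"mode": "dev", "debug": False}
overrides = {"debug": True}
settings = ChainMap(overrides, defaults)

print(dict(counts))                         # {'a': 3, 'b': 1, 'c': 1}
print(settings["mode"], settings["debug"])  # dev True
```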
# Basic dictionary usage
person = {'name': 'Bob', 'age': 25}
print(person['name'])
print(person.get('city', 'Unknown'))
person['age'] = 26
person.setdefault('country', 'India')
# Merging dictionaries (Python 3.9+)
a = {'x': 1}
b = {'y': 2}
merged = a | b
print(merged)
Built-in Dictionary Functions
Useful dict methods: keys(), values(), items(), get(), pop(), popitem(), clear(), update(), setdefault(). Of these, keys(), values(), and items() return view objects that reflect later changes to the dict.
Iterating dicts: iterating directly yields keys; use items() to get key-value pairs. For performance, avoid repeated dict lookups in loops—cache lookups in local variables if needed.
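To see what a live view means, compare a view with a snapshot list taken at the same moment. This is a small sketch with made-up data.

```python
stock = {"apples": 3, "pears": 5}
keys_view = stock.keys()     # a live view, not a snapshot
keys_snapshot = list(stock)  # a copy taken now

stock["plums"] = 7
print(list(keys_view))  # ['apples', 'pears', 'plums']
print(keys_snapshot)    # ['apples', 'pears']
```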
d = {'a': 1, 'b': 2}
print(list(d.keys()))
print(list(d.values()))
print(list(d.items()))
# Using get with default
print(d.get('z', 0))
Loop through the Dictionary Object
Loop patterns: for key in d, for key, val in d.items(), and for val in d.values(). When mutating a dictionary while iterating, iterate over a copy (list(d.items())) to avoid runtime errors.
Use enumerate() when you need an index while iterating items; use sorted(d.items()) if you need deterministic order by key.
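A short sketch of both ideas, using an invented scores dictionary: sorted() gives a predictable order, and enumerate() adds a running index on top of it.

```python
scores = {"carol": 82, "alice": 91, "bob": 77}

# sorted(d.items()) orders the pairs by key
by_name = sorted(scores.items())
print(by_name)  # [('alice', 91), ('bob', 77), ('carol', 82)]

# enumerate adds an index; start=1 gives human-friendly ranks
by_score = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
ranked = []
for rank, (name, score) in enumerate(by_score, start=1):
    ranked.append((rank, name, score))
print(ranked)   # [(1, 'alice', 91), (2, 'carol', 82), (3, 'bob', 77)]
```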
d = {'a': 1, 'b': 2, 'c': 3}
for k, v in d.items():
    print(f'{k} -> {v}')
# Safe mutation example
for k in list(d.keys()):
    if k == 'b':
        del d[k]
Set Operations
Sets represent unordered collections of unique elements. They support union, intersection, difference, symmetric_difference, and subset/superset checks. Because sets are hash-based, elements must be hashable.
Use sets for deduplication, membership tests, and set algebra. frozenset provides an immutable variant usable as dict keys.
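Subset checks and frozenset keys can be sketched as follows. The names and the 0.9 value are placeholders for illustration.

```python
small = {1, 2}
large = {1, 2, 3, 4}
is_subset = small <= large      # True: every element of small is in large
is_superset = large >= small    # True
no_overlap = small.isdisjoint({8, 9})  # True: no shared elements

# frozenset is hashable, so it can be a dict key; element order is irrelevant
pair_scores = {frozenset({"a", "b"}): 0.9}
same_pair = pair_scores[frozenset({"b", "a"})]

print(is_subset, is_superset, no_overlap, same_pair)  # True True True 0.9
```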
a = {1, 2, 3}
b = {3, 4, 5}
print('union', a | b)
print('intersection', a & b)
print('difference', a - b)
print('symmetric', a ^ b)
# Using sets to deduplicate a list
nums = [1, 2, 2, 3, 3, 3]
unique = set(nums)
print(unique)
This is a fairly basic exercise to get a little practice and start understanding Python.
A Jupyter notebook is attached for your reference, but I suggest you try the code yourself in the coming week. We will go a little deeper into Python in the upcoming posts in this series.




