
Data Science from Zero to Hero: Python for Data Science Session 1

  • Writer: Jitendra Singh
  • Jan 26
  • 8 min read


To start learning Python, make sure you have installed Anaconda or Jupyter Notebook on your system. Click here if you need help with that. We will start with a few basic Python concepts. Don't worry, we will work through practical examples to understand them. You are welcome to practice everything offered at https://www.w3schools.com/python/, but the topics shared here are a great place to start your data science journey:


• Python Variable types

• Reassign variable

• Python data types

• Create numeric & string variables

• Checking data types with the type() function

• Data type conversion

• Type casting

• Functions in python

• Commenting code in python

• Python Operators

• Runtime variable

• Data slicing

• String Handling

• Python Flow Control (Conditional Statements, Loops, Function Calls )

• Pseudocode

• Python Objects (Container or collection objects)

• List operations

• Tuple Operations

• Dictionary operations

• Built-in dictionary functions

• Loop through the dictionary object

• Set Operations



Python Variable Types

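Variables are names bound to objects in Python. Unlike statically typed languages, Python is dynamically typed: a variable name only refers to an object, and the object carries the type information. (Duck typing is a related but separate idea: code relies on what an object can do rather than on its declared type.)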

Variables are references to objects in memory. Types are tied to values (objects), not to variable names. You can inspect the identity using id() and the type using type(). Understanding that variables are references helps explain behavior when mutating objects (e.g., lists) vs. reassigning names.

# Examples: variable names referencing objects
a = 10                # 'a' references an int object with value 10
b = a                 # 'b' now references the same int object
print(id(a), id(b))   # same id: both names reference the same object
a = a + 1             # rebind 'a' to a new int object (ints are immutable)
print(a, b)           # 11 10: 'b' still references the original object
# Mutable object example
lst1 = [1, 2, 3]
lst2 = lst1            # both names reference the same list
lst2.append(4)
print(lst1)            # prints [1, 2, 3, 4]: the list is mutable and both names share it
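As a small follow-up to the shared-reference behavior above, here is a minimal sketch of one common way to get an independent list rather than a second name for the same one. The variable names are illustrative only.

```python
original = [1, 2, 3]
alias = original           # second name for the same list object
copied = original.copy()   # new list object holding the same items (shallow copy)

alias.append(4)

print(original)            # the append made through 'alias' is visible here
print(copied)              # the copy is unaffected
print(alias is original)   # True
print(copied is original)  # False
```

Note that copy() is shallow: nested mutable items would still be shared between the two lists.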

Reassign Variable

Reassignment changes what object a variable name refers to. Because Python variables are labels for objects, reassigning doesn't mutate the original object (unless the object itself is mutated).

Because of dynamic typing, you can assign values of different types to the same variable name at different times. While flexible, excessive reassigning to different types can make code harder to read and error-prone—prefer consistent use within a given scope.

x = 42            # integer
print(type(x), x)
x = "now a string" # reassign to string
print(type(x), x)
# Reassigning complex objects
config = {"mode": "dev"}
config = ["dev", "staging", "prod"]  # same name now refers to a list

Python Data Types

Python provides several built-in data types: numeric (int, float, complex), sequence (str, list, tuple, range), mapping (dict), set, and boolean. There are also user-defined classes and many specialized types in the standard library.

The most important distinction is mutability: tuples and strings are immutable, while lists, dicts, and sets are mutable. Immutable built-in values are hashable, so they can serve as dict keys and set members, and they are easier to reason about. Mutable objects can be updated in place, which avoids rebuilding large, changing datasets.

# Numeric types

i = 10
f = 3.14
c = 1+2j
# Sequences
s = "hello"
lst = [1,2,3]
t = (1,2,3)
# Mapping and set
d = {"a": 1}
st = {1,2,3}
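To make the mutable/immutable distinction concrete, the sketch below tries to change a string in place and to use a list as a dict key. The names (text, point, locations) are made up for illustration.

```python
text = "hello"
point = (1, 2)

try:
    text[0] = "H"              # str does not support item assignment
except TypeError as exc:
    print("str is immutable:", exc)

# Immutable, hashable values work as dict keys...
locations = {point: "start"}
print(locations[(1, 2)])

# ...while a mutable list is unhashable and is rejected.
try:
    bad = {[1, 2]: "start"}
    list_key_ok = True
except TypeError as exc:
    list_key_ok = False
    print("list cannot be a key:", exc)
```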

Create Numeric & String Variables

Creating numeric and string variables is straightforward. Python automatically assigns the appropriate object type.

Be mindful of numeric literal formats (binary 0b, octal 0o, hex 0x), underscores in numeric literals for readability, and raw strings for backslashes in paths or regular expressions.

# Numeric literals and underscores

million = 1_000_000
binary_val = 0b1010
hex_val = 0xFF

# Strings and raw strings

normal = "Hello\nWorld"
raw = r"Hello\nWorld"  # backslash not interpreted
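The following sketch prints what these literals actually evaluate to, which can help when reading them for the first time. The Windows-style path is a hypothetical example.

```python
million = 1_000_000
print(million, 0b1010, 0o17, 0xFF)   # 1000000 10 15 255

normal = "Hello\nWorld"    # \n is one newline character
raw = r"Hello\nWorld"      # backslash and 'n' stay as two characters
print(len(normal), len(raw))   # 11 12

windows_path = r"C:\temp\new_file.txt"   # hypothetical path; raw string keeps \t and \n literal
print(windows_path)
```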

Checking Data Types with type() Function

Use type(obj) to get the class/type of an object. For type checking, prefer duck typing or isinstance(obj, Class) rather than comparing types directly; isinstance supports inheritance and is more flexible.

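Note: a direct comparison such as type(x) is int returns False for instances of subclasses (for example, type(True) is int is False because bool is a subclass of int), whereas isinstance also accepts abstract base classes such as numbers.Real.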

from numbers import Real


x = 3
print(type(x))                  # <class 'int'>
print(isinstance(x, int))       # True
print(isinstance(x, Real))      # True for int and float
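The difference between the two checks shows up as soon as a subclass is involved. This is a small sketch; describe() is a hypothetical helper, not something from a library.

```python
from numbers import Real

flag = True                      # bool is a subclass of int

print(type(flag) is int)         # False: exact type is bool
print(isinstance(flag, int))     # True: subclasses are accepted
print(isinstance(2.5, Real))     # True
print(isinstance("2.5", Real))   # False: a string is not a number


def describe(value):
    """Hypothetical helper: label a value as numeric or not."""
    return "number" if isinstance(value, Real) else "not a number"


print(describe(3), "/", describe("3"))
```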

Data Type Conversion / Type Casting

Python provides explicit conversion functions: int(), float(), str(), bool(), list(), tuple(), set(), dict(). Conversion may fail (ValueError) if the input string doesn’t represent the target type.

Implicit conversions also occur in numeric expressions (e.g., int + float -> float). Be deliberate with conversions: int() applied to a float truncates toward zero rather than rounding, and converting a non-numeric string raises ValueError.

# Examples of conversions

s = '123'
n = int(s)
f = float(s)
print(n + 10, f * 1.5)

# Converting containers

t = (1,2,3)
print(list(t))  # [1,2,3]

# Edge case: invalid conversion

try:
    int("12.3")   # ValueError: invalid literal
except ValueError as e:
    print("Can't convert:", e)
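To illustrate truncation, implicit promotion, and a defensive conversion in one place, here is a short sketch. to_int() is a hypothetical helper written for this example.

```python
print(int(3.9))      # 3: truncates toward zero
print(int(-3.9))     # -3: also toward zero, not down to -4
print(round(3.9))    # 4: use round() when rounding is intended

total = 2 + 0.5      # int + float is promoted to float
print(type(total))


def to_int(text, default=None):
    """Hypothetical helper: parse an integer string, or return default on failure."""
    try:
        return int(text)
    except ValueError:
        return default


print(to_int("42"), to_int("12.3"), to_int("abc", default=0))
```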

Functions in Python

Functions encapsulate reusable logic. Define with def or lambda (anonymous). Functions are objects and can be passed around, returned, and stored. Type hints (PEP 484) are optional but useful for documentation and tooling.

Key concepts: positional vs keyword arguments, default arguments, *args/**kwargs for variable arguments, closures, and decorators for higher-order behavior.

# Basic function with return and type hints

def add(a: int, b: int) -> int:
    """Return sum of a and b."""
    return a + b
print(add(2,3))

# Args and kwargs, default values

def greet(name, *titles, sep=' ', **attrs):
    print(f"Hello {sep.join((name,)+titles)}; attrs={attrs}")
greet("Dr. Alice", "PhD", sep=", ", role="instructor")
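Closures, decorators, and lambda are mentioned above but not shown, so here is a minimal sketch of each. All function names (make_multiplier, log_calls, combine) are invented for illustration.

```python
import functools


def make_multiplier(factor):
    """Closure: the inner function remembers 'factor' after this call returns."""
    def multiply(value):
        return value * factor
    return multiply


def log_calls(func):
    """Decorator: record each call's arguments, then delegate to func."""
    calls = []

    @functools.wraps(func)       # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        calls.append((args, kwargs))
        return func(*args, **kwargs)

    wrapper.calls = calls
    return wrapper


@log_calls
def combine(a, b=0):
    return a + b


double = make_multiplier(2)
square = lambda n: n * n         # anonymous function bound to a name

print(double(5), square(4), combine(2, b=3))
print(combine.calls)
```

Because functions are objects, the decorator can attach the calls list to the wrapper as an ordinary attribute.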

Commenting Code in Python

Use # for single-line comments. Triple-quoted strings are often used as docstrings for modules, classes, and functions (accessible via __doc__). Comments should explain why — not what — when the code is non-obvious.

Docstrings follow conventions (PEP 257) and are used by tools like Sphinx. Keep comments up-to-date; stale comments are worse than none.

# Single-line comment explaining intent

attempts = 0
attempts += 1  # count this try so the caller can stop after too many failures

def func():
    """Short docstring: describe function purpose, args, and return."""
    pass
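Since docstrings are available at runtime, a quick way to see the difference from comments is to read one back. The area() function here is just an illustrative example.

```python
import inspect


def area(width, height):
    """Return the area of a width-by-height rectangle."""
    # Comments like this one are discarded; only the docstring is kept.
    return width * height


print(area.__doc__)           # raw docstring
print(inspect.getdoc(area))   # same text with indentation cleaned up
```

In a notebook, help(area) shows the same text along with the signature.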

Python Operators

Operators include arithmetic (+, -, *, /, //, %, **), comparison (==, !=, <, >, <=, >=), logical (and, or, not), assignment (=, +=), identity (is, is not), and membership (in, not in).

Operator precedence matters; use parentheses to clarify. The 'is' operator checks object identity, not equality; use '==' for value equality.

# Examples

a, b = 7, 3
print(a // b)   # floor division = 2
print(a ** b)   # exponentiation = 343
print(a == 7)   # True
x, y = [1, 2], [1, 2]
print(x == y)   # True: equal values
print(x is y)   # False: two distinct list objects
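A few precedence cases are easy to get wrong, so this short sketch spells them out; parentheses make the intent explicit in each pair.

```python
plain = 2 + 3 * 4          # multiplication happens first
grouped = (2 + 3) * 4
negated_power = -2 ** 2    # ** binds tighter than unary minus
explicit_power = (-2) ** 2
logic = not True or True   # parsed as (not True) or True
print(plain, grouped, negated_power, explicit_power, logic)   # 14 20 -4 4 True

print(7 / 2, 7 // 2, 7 % 2)    # 3.5 3 1
print(-7 // 2, -7 % 2)         # -4 1: floor division rounds toward negative infinity
```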

Runtime Variables (Input, Environment, CLI)

Runtime variables are values provided while a program runs: user input (input()), environment variables (os.environ), command-line arguments (sys.argv or argparse).

Always validate and sanitize runtime input; it can be malformed, missing, or malicious. For CLI programs, prefer argparse for robust argument parsing and help messages.

# Reading user input

name = input('Enter name: ')
age_s = input('Enter age: ')
try:
    age = int(age_s)
except ValueError:
    age = None            # keep the name defined even when parsing fails
    print('Invalid age')

# Reading environment variables

import os
db_url = os.getenv('DATABASE_URL', 'sqlite:///local.db')
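For command-line arguments, a minimal argparse sketch might look like the following. The argument names (name, --age, --shout) are invented for this example, and an explicit list stands in for the real command line so the code also runs inside a notebook.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Greet a user (illustrative CLI).")
    parser.add_argument("name", help="name to greet")
    parser.add_argument("--age", type=int, default=None, help="age in years")
    parser.add_argument("--shout", action="store_true", help="print in upper case")
    return parser


# An explicit list stands in for sys.argv[1:]; call parse_args() with no
# arguments in a real script to read the actual command line.
args = build_parser().parse_args(["Alice", "--age", "30"])

message = f"Hello {args.name}, age {args.age}"
if args.shout:
    message = message.upper()
print(message)
```

With type=int, argparse rejects a non-numeric age with a usage message, so the script does not need its own try/except for that case.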

Data Slicing (Strings, Lists, Ranges)

Slicing extracts subsequences: seq[start:stop:step]. Start is inclusive, stop is exclusive. Omitted indices default to the beginning and end of the sequence (swapped when step is negative), and step defaults to 1. Negative indices count from the end.

Slicing returns a new object for sequences like list and str. For large slices, avoid copying unnecessarily (use iterators) if performance/memory matters.

s = 'PythonProgramming'
print(s[0:6])     # 'Python'
print(s[6:])      # 'Programming'
print(s[-11:-1])  # 'Programmin' (negative indices count from the end)
print(s[::2])     # 'PtoPormig' (every second character)
lst = [0,1,2,3,4,5]
print(lst[1:4])   # [1,2,3]
print(lst[::-1])  # reversed copy
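When copying matters, one option is to slice lazily. The sketch below uses itertools.islice on a large range; the sizes are arbitrary.

```python
from itertools import islice

numbers = range(1_000_000)    # a range is lazy: no million-item list is built

first_five = list(islice(numbers, 5))            # like numbers[:5], but works on any iterable
every_third = list(islice(numbers, 0, 12, 3))    # start, stop, step
print(first_five)     # [0, 1, 2, 3, 4]
print(every_third)    # [0, 3, 6, 9]

# Slicing a range returns another range, so nothing is copied here either.
sub = numbers[10:20:2]
print(sub, list(sub))
```

Unlike normal slicing, islice does not accept negative indices, because it consumes the iterable from the front.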

String Handling

Strings are immutable sequences of Unicode code points. Common operations include concatenation, formatting (f-strings, format(), %), searching, splitting, and regular expressions. For heavy string building, collect the pieces in a list and call str.join once, which avoids the quadratic cost of repeated concatenation.

Be mindful of encoding when working with bytes and I/O: convert between str and bytes using encode/decode with a known encoding (utf-8 preferred).

# f-strings and formatting

name = 'Alice'
g = f'Hello, {name.upper()}'
print(g)

# Efficient building

parts = []
for i in range(5):
    parts.append(str(i))
result = ','.join(parts)
print(result)

# Bytes and encoding

text = 'café'
b = text.encode('utf-8')
print(b)
print(b.decode('utf-8'))
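Searching, splitting, and regular expressions are listed above without an example, so here is a brief sketch on a made-up input line.

```python
import re

line = "name=Alice; city=Pune; year=2024"   # illustrative data

pos = line.find("city")        # index of the first match, or -1 if absent
print(pos, "Pune" in line)

# Split into key=value pairs and build a dict from them.
fields = dict(part.strip().split("=") for part in line.split(";"))
print(fields)

# Regular expression: find every run of exactly four digits.
years = re.findall(r"\d{4}", line)
print(years)
```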

Python Flow Control (Conditionals, Loops, Function Calls)

Control flow directs execution through conditionals (if/elif/else) and loops (for/while). For loops iterate over iterables; while loops use a condition. Use break/continue to control loop flow.

Prefer 'for item in iterable' over index-based loops when possible. Use generator expressions and iterators for memory-efficient pipelines. Function calls encapsulate behavior and should be kept small (single responsibility).

# Conditional

x = 10
if x < 0:
    print('negative')
elif x == 0:
    print('zero')
else:
    print('positive')

# Loop and function call

def is_prime(n):
    if n < 2:
        return False
    for i in range(2, int(n**0.5)+1):
        if n % i == 0:
            return False
    return True
for num in range(2, 20):
    print(num, is_prime(num))
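The remaining control-flow tools mentioned above (break, continue, while, and generator expressions) can be sketched as follows; the numbers are arbitrary.

```python
# break: stop at the first match
first_even = None
for value in [7, 9, 12, 15, 18]:
    if value % 2 == 0:
        first_even = value
        break

# continue: skip items that are not wanted
odds = []
for value in range(10):
    if value % 2 == 0:
        continue
    odds.append(value)

# while: repeat until the condition becomes false
countdown = 3
steps = []
while countdown > 0:
    steps.append(countdown)
    countdown -= 1

# generator expression: squares are produced one at a time, never stored in a list
total = sum(n * n for n in range(1, 6))

print(first_even, odds, steps, total)   # 12 [1, 3, 5, 7, 9] [3, 2, 1] 55
```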

Pseudocode

Pseudocode is a language-agnostic way to express algorithms before coding. It helps plan edge cases, data structures, and complexity.

Good pseudocode focuses on clarity: describe intent with simple constructs. Once validated, translate into Python, keeping variable names consistent with the pseudocode.

# Example pseudocode for computing factorial:

# 1. Read n
# 2. result = 1
# 3. For i from 2 to n:
#       result = result * i
# 4. Print result

# Translated to Python:

def factorial(n):
    result = 1
    for i in range(2, n+1):
        result *= i
    return result
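Pseudocode is also a good place to decide what happens on bad input. Assuming the plan were extended with a step "0. Reject negative or non-integer n", one possible translation is sketched below; factorial_checked is a hypothetical name, and math.factorial is used only to cross-check the result.

```python
import math


def factorial_checked(n):
    """Factorial with the edge cases from the pseudocode handled explicitly."""
    if not isinstance(n, int) or n < 0:
        raise ValueError("n must be a non-negative integer")
    result = 1
    for i in range(2, n + 1):   # empty loop for n = 0 and n = 1, so result stays 1
        result *= i
    return result


for n in (0, 1, 5):
    print(n, factorial_checked(n), factorial_checked(n) == math.factorial(n))
```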

Python Objects (Container or Collection Objects)

Collection objects aggregate multiple values: lists, tuples, sets, dicts. Choose based on needs: ordered, mutable sequence (list), fixed collection (tuple), unique elements (set), key-value mapping (dict).

Understand time complexities: list append O(1) amortized, dict/set average O(1) for lookup/insert, list membership O(n). These influence data structure choice in algorithms.

# Examples

lst = [1,2,3]
tup = (1,2,3)
st = {1,2,3}
d = {'a':1, 'b':2}
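To get a feel for the complexity difference, the rough sketch below times a membership test on a list and on a set with the standard timeit module. The sizes are arbitrary and the timings will vary by machine, so treat the output as indicative only.

```python
import timeit

items = list(range(10_000))
as_list = items
as_set = set(items)
target = 9_999    # worst case for the list: the last element

# 'in' scans a list item by item (O(n)) but hashes straight to the slot in a set.
list_time = timeit.timeit(lambda: target in as_list, number=1_000)
set_time = timeit.timeit(lambda: target in as_set, number=1_000)

print(f"list: {list_time:.4f}s  set: {set_time:.4f}s")
```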

List Operations

Lists are mutable ordered sequences. Common operations: append, extend, insert, pop, remove, clear, index, count, sort, reverse. Use list comprehensions for concise transformations.

Be cautious: list multiplication with nested lists can create shared references. For large data, prefer generator expressions to avoid memory spikes.

# Common list operations

fruits = ['apple', 'banana']
fruits.append('cherry')
fruits.extend(['date', 'elderberry'])
fruits.insert(1, 'blueberry')
print(fruits)
print(fruits.pop())   # remove last
print(fruits.remove('banana'))  # returns None, mutates list
# List comprehensions vs generator
squares = [i*i for i in range(10)]
gen = (i*i for i in range(10))
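The shared-reference pitfall with list multiplication mentioned above looks like this in practice. It is a small sketch with made-up variable names.

```python
# Multiplying the outer list repeats the SAME inner list object.
shared = [[0] * 3] * 2
shared[0][0] = 1
print(shared)         # [[1, 0, 0], [1, 0, 0]]

# A comprehension builds a fresh inner list on every iteration.
independent = [[0] * 3 for _ in range(2)]
independent[0][0] = 1
print(independent)    # [[1, 0, 0], [0, 0, 0]]

# A generator expression feeds sum() one value at a time, with no intermediate list.
total = sum(x * x for x in range(1_000))
print(total)
```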

Tuple Operations

Tuples are immutable ordered sequences. They are useful as keys in dictionaries or to represent fixed records. You can unpack tuples into variables and use them as return values from functions.

Though immutable, a tuple can contain mutable objects (like lists), which can be mutated in place—only the tuple's structure is immutable.

# Tuple examples

record = ('Alice', 30, 'Engineer')
name, age, role = record  # unpacking
print(name, age, role)

# Tuple with mutable item

t = (1, [2,3], 4)
t[1].append(5)
print(t)
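Two of the uses mentioned above, returning several values from a function and using tuples as dictionary keys, can be sketched like this. min_max and the grid data are invented for the example.

```python
def min_max(values):
    """Hypothetical helper: return (smallest, largest) as a tuple."""
    return min(values), max(values)


low, high = min_max([4, 9, 1, 7])    # unpack the returned tuple
print(low, high)                      # 1 9

# Tuples are hashable, so coordinate pairs work as dict keys.
grid = {(0, 0): "start", (2, 3): "goal"}
print(grid[(2, 3)])

# The tuple itself cannot be changed.
try:
    (1, 2, 3)[0] = 99
except TypeError as exc:
    print("tuple is immutable:", exc)
```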

Dictionary Operations

Dictionaries map hashable keys to values. Keys must be hashable, which in practice means immutable built-ins such as strings, numbers, and tuples of hashable items. Common operations: access, setdefault, get, update, pop, popitem, clear.

Since Python 3.7, dicts preserve insertion order as part of the language. For specialized mappings, use collections.OrderedDict (mainly for older versions or for reordering methods such as move_to_end), defaultdict, or ChainMap.

# Basic dictionary usage

person = {'name': 'Bob', 'age': 25}
print(person['name'])
print(person.get('city', 'Unknown'))
person['age'] = 26
person.setdefault('country', 'India')

# Merging dictionaries (Python 3.9+)

a = {'x': 1}
b = {'y': 2}
merged = a | b
print(merged)
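The specialized mappings and the update/pop operations named above are sketched below. The words and settings are illustrative data, not taken from a real project.

```python
from collections import ChainMap, defaultdict

# defaultdict: missing keys get a fresh default (here an empty list).
words = ["apple", "avocado", "banana", "blueberry", "cherry"]
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)
print(dict(by_letter))

# ChainMap: look up in 'overrides' first, then fall back to 'defaults'.
defaults = {"mode": "dev", "debug": False}
overrides = {"debug": True}
settings = ChainMap(overrides, defaults)
print(settings["mode"], settings["debug"])    # dev True

# update() merges in place; pop() removes a key and returns its value.
person = {"name": "Bob", "age": 25}
person.update({"age": 26, "city": "Unknown"})
removed = person.pop("city")
print(person, removed)
```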

Built-in Dictionary Functions

Useful dict methods: keys(), values(), items(), get(), pop(), popitem(), clear(), update(), setdefault(). Of these, keys(), values(), and items() return view objects that reflect later changes to the dict.

Iterating dicts: iterating directly yields keys; use items() to get key-value pairs. For performance, avoid repeated dict lookups in loops—cache lookups in local variables if needed.

d = {'a':1, 'b':2}
print(list(d.keys()))
print(list(d.values()))
print(list(d.items()))

# Using get with default

print(d.get('z', 0))
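To show what "view object" means in practice, the sketch below compares a live view with a list snapshot and then removes items with pop() and popitem(). The inventory dict is made up for the example.

```python
inventory = {"a": 1, "b": 2}

keys = inventory.keys()      # live view of the keys
snapshot = list(keys)        # independent copy taken now

inventory["c"] = 3
print(keys)                  # the view now includes 'c'
print(snapshot)              # ['a', 'b']: the snapshot does not change

print(inventory.pop("a"))    # 1: remove a specific key and return its value
print(inventory.popitem())   # ('c', 3): removes the most recently inserted pair
print(inventory)             # {'b': 2}
```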


Loop through the Dictionary Object

Loop patterns: for key in d, for key, val in d.items(), and for val in d.values(). When adding or removing keys while iterating, iterate over a copy (list(d.items())) to avoid "RuntimeError: dictionary changed size during iteration".

Use enumerate() when you need an index while iterating items; use sorted(d.items()) if you need deterministic order by key.

d = {'a':1, 'b':2, 'c':3}
for k, v in d.items():
    print(f'{k} -> {v}')

# Safe mutation example

for k in list(d.keys()):
    if k == 'b':
        del d[k]
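Here is a short sketch of the error itself, the safe pattern, and the enumerate/sorted combination described above. The scores dict is illustrative data.

```python
scores = {"bob": 58, "alice": 91, "carol": 72}

# Deleting while iterating over the dict itself raises RuntimeError.
broken = dict(scores)        # work on a copy so 'scores' stays intact
try:
    for name in broken:
        if broken[name] < 60:
            del broken[name]
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)
print(error_message)

# Safe: iterate over a list copy of the keys, mutate the original.
for name in list(scores):
    if scores[name] < 60:
        del scores[name]

# enumerate() adds an index; sorted() gives a deterministic order by key.
ranked = [(i, name, score) for i, (name, score) in enumerate(sorted(scores.items()), start=1)]
print(ranked)
```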





Set Operations

Sets represent unordered collections of unique elements. They support union, intersection, difference, symmetric_difference, and subset/superset checks. Because sets are hash-based, elements must be hashable.

Use sets for deduplication, membership tests, and set algebra. frozenset provides an immutable variant usable as dict keys.

a = {1,2,3}
b = {3,4,5}
print('union', a | b)
print('intersection', a & b)
print('difference', a - b)
print('symmetric', a ^ b)

# Using sets to deduplicate a list

nums = [1,2,2,3,3,3]
unique = set(nums)
print(unique)
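Subset checks and frozenset are mentioned above without an example, so here is a brief sketch. It also shows one common way to deduplicate while keeping the original order, which set() alone does not guarantee. All data is made up.

```python
a = {1, 2, 3}

print({1, 2} <= a)             # True: subset
print(a >= {1, 2})             # True: superset
print(a.isdisjoint({4, 5}))    # True: no elements in common

# Deduplicate but keep first-seen order (dict keys are unique and ordered).
nums = [3, 1, 3, 2, 1]
unique_in_order = list(dict.fromkeys(nums))
print(unique_in_order)         # [3, 1, 2]

# frozenset is hashable, so it can be a dict key; element order is irrelevant.
pair_scores = {frozenset({"x", "y"}): 10}
print(pair_scores[frozenset({"y", "x"})])   # 10
```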

This is a fairly basic exercise for getting a little practice and starting to understand Python.

A Jupyter notebook is attached for your reference, but I suggest you try the code yourself over the coming week. We will go a little deeper into Python in the upcoming posts in this series.



 
 
 


© 2024 All Rights Reserved
