In programming languages, a data structure is a way to store and manipulate data efficiently.
In this chapter 5, we will take a deeper look at lists, tuples, etc., and cover various techniques such as data structures like sets and dictionaries, as well as looping techniques and conditional statements.
1. More on Lists
List: A mutable sequence structure that can store, add, delete, and sort values of various types in order.
1) List Methods
List Methods: A collection of built-in functions that perform addition, deletion, searching, sorting, etc., of list elements.
list.pop(0) is O(n): all front elements must be shifted → inefficient.
deque.popleft() is O(1): fast for both front and back insertions/deletions.
Optimal for inter-thread sharing, event handling, and sequential processing.
AI Application Examples
Data Buffering: Sequentially processing sensor data, logs, and time-series data.
Experience Replay Buffer (Reinforcement Learning): Learning sequentially from the oldest experiences.
BFS (Breadth-First Search): Graph traversal, finding the shortest path, etc.
Batch Processing Queue: When processing training samples, job requests, etc., in order.
4) List Comprehensions
Comprehension: A syntax for creating a new list by concisely iterating, filtering, and mapping an existing sequence.
squares = [x**2 for x in range(10)]
positives = [x for x in range(10) if x % 2 == 0]
Insight
Optimal balance of readability and performance.
Frequently used in data preprocessing.
5) Nested List Comprehensions
Nested Comprehension: A form that includes another comprehension inside a list comprehension.
matrix = [
[1,2,3,4],
[5,6,7,8],
[9,10,11,12],
]
# Transpose
transposed = [[row[i] for row in matrix] for i in range(4)]
# Or
transposed = list(zip(*matrix))
2. The del statement
del: A statement that deletes elements of a variable or data structure to unbind the name-object binding.
a = [1,2,3,4,5]
del a[0] # [2,3,4,5]
del a[1:3] # [2,5]
del a # Raises NameError afterwards
3. Tuples and Sequences
Tuple: An immutable ordered collection of data that cannot be changed after creation.
Creation: (1,2,3) or 1,2,3
Characteristic: Immutable → can be used as dictionary keys or elements in a set.
Packing/Unpacking:
t = 123, 456, 'abc'
x, y, z = t
Empty tuple: ()
Tuple with one element: ('hello',)
Insight
Suitable for grouping heterogeneous data.
student = ("John Doe", 21, True)
Immutability → ensures data safety.
4. Sets
Set: A collection of unique elements that supports fast membership testing and set operations.
basket = {'apple','orange','pear','banana'}
a = set('abracadabra')
b = set('alacazam')
print(a - b, a | b, a & b, a ^ b)
# Difference, Union, Intersection, Symmetric Difference
Set Comprehension: {x for x in iterable if cond}
Insight
Ensures data uniqueness.
Useful for text corpus processing and deduplication.
5. Dictionaries
Dictionary: A mapping structure of key-value pairs.
Creation: {'jack':4098,'sape':4139} or dict(jack=4098,sape=4139)
Keys must be of an immutable type; insertion order is maintained (Python 3.7+).
Basic Operations:
tel = {'jack':4098,'sape':4139}
tel['guido'] = 4127
del tel['sape']
'jack' in tel
Dictionary Comprehension:
{x: x**2 for x in (2,4,6)}
Insight
Used for hyperparameter and label mapping.
Fast lookup structure based on hashing.
6. Looping Techniques
Looping Techniques: Methods for efficiently iterating over sequences and mappings using various built-in functions.
for k,v in dict.items()
for i,v in enumerate(seq)
for x,y in zip(seq1,seq2)
reversed(), sorted()
Avoid in-place modification:
import math
data = [56.2, float('nan'), 51.7]
filtered = [v for v in data if not math.isnan(v)]
7. More on Conditions
Conditional Expression: Controls code flow by combining comparison and logical operations.
Comparison operators: <, >, ==, !=, in, is
Chaining comparisons: a < b == c
Logical operators: and, or, not (short-circuit evaluation)
Walrus operator (:=, Python 3.8+):
if (n := len(data)) > 10:
print(n)
Insight
Specifying a default value (x or default)
Preventing duplicate calculations.
8. Comparing Sequences and Other Types
Sequence Comparison: Compares elements lexicographically based on their insertion order.