
Python NumPy Summary

만 기 2023. 3. 14. 23:49

NumPy

A Python package for numerical computing. The name is short for Numerical Python.
It provides data structures such as arrays and matrices.

Using NumPy

🔻Installing and importing NumPy

In [1]:

!pip install numpy

import numpy as np

🔻Feature 1. The n-dimensional array object, ndarray

It supports element-wise computation similar to matrix operations.
It is a fast and flexible data container.

In [20]:

# Create arrays
data1 = [1, 2, 3, 4, 5]
arr1 = np.array(data1)

data2 = [[1, 2, 3, 4], [5, 6, 7, 8]]
arr2 = np.array(data2)

arr1, arr2

Out[20]:

(array([1, 2, 3, 4, 5]),
 array([[1, 2, 3, 4],
        [5, 6, 7, 8]]))

In [3]:

# Generate random numbers
data = np.random.randn(2, 3)    # randn: samples from the standard normal (Gaussian) distribution; arguments are (rows, columns)
data

Out[3]:

array([[-1.45113073,  0.66797394, -0.90863637],
       [ 0.21630006,  0.06665821,  1.58449292]])

In [19]:

# Create arrays filled with 0, 1, or any chosen value
display(np.zeros(5), np.ones((2, 3)), np.full((4, 2), 7))

array([0., 0., 0., 0., 0.])

array([[1., 1., 1.],
       [1., 1., 1.]])

array([[7, 7],
       [7, 7],
       [7, 7],
       [7, 7]])

In [13]:

np.empty((2, 9))

Out[13]:

array([[6.23042070e-307, 3.56043053e-307, 1.60219306e-306,
        7.56571288e-307, 1.89146896e-307, 1.37961302e-306,
        1.05699242e-307, 8.01097889e-307, 1.78020169e-306],
       [7.56601165e-307, 1.02359984e-306, 1.33510679e-306,
        2.22522597e-306, 1.24611674e-306, 1.29061821e-306,
        6.23057349e-307, 1.86920193e-306, 9.34608432e-307]])
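The values printed above carry no meaning: np.empty only allocates memory and does not initialize it, so the contents can differ from run to run. A minimal sketch of the usual pattern, where every element is overwritten before it is read (the name buf is illustrative, not from the original notebook):

```python
import numpy as np

# np.empty allocates without initializing, so fill every element before reading.
buf = np.empty((2, 3))
for i in range(buf.shape[0]):
    buf[i] = np.arange(3) * (i + 1)   # overwrite row i completely

print(buf)
```

If the array should start from known values, np.zeros or np.full is the safer choice.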

In [16]:

np.eye(3, 3)

Out[16]:

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

🔻Inspecting ndarray attributes

In [42]:

data = [[1, 2, 'a', 4.2], ['b', 'c', -5, 3.8]]
ndarr = np.array(data)

# Inspect attributes
display(ndarr)
print(len(ndarr))    # 2 (length of the first dimension)
print(ndarr.size)    # 8
print(ndarr.ndim)    # 2
print(ndarr.shape)   # (2, 4)
print(ndarr.dtype)   # <U32
print(np.unique(ndarr))    # ['-5' '1' '2' '3.8' '4.2' 'a' 'b' 'c']

# Change attributes
ndarr = ndarr.astype(np.bytes_)     # convert to a byte-string dtype (np.string_, the older alias, was removed in NumPy 2.0)
ndarr = ndarr.reshape(4, 2)         # change the shape
display(ndarr)

array([['1', '2', 'a', '4.2'],
       ['b', 'c', '-5', '3.8']], dtype='<U32')

2
8
2
(2, 4)
<U32
['-5' '1' '2' '3.8' '4.2' 'a' 'b' 'c']

array([[b'1', b'2'],
       [b'a', b'4.2'],
       [b'b', b'c'],
       [b'-5', b'3.8']], dtype='|S32')
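astype is more commonly used to convert between numeric types than to byte strings. A small sketch, assuming every string in the array is a valid number (the array contents here are made up for illustration):

```python
import numpy as np

str_arr = np.array(['1.5', '-2', '42'])
num_arr = str_arr.astype(np.float64)   # parse each string as a float
int_arr = num_arr.astype(np.int64)     # float -> int truncates toward zero

print(num_arr, int_arr)

# reshape accepts -1 for one dimension and infers it from the total size
grid = np.arange(8).reshape(4, -1)
print(grid.shape)   # (4, 2)
```

astype always returns a new array, so the original is left unchanged.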

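🔻Feature 2. Vectorized computation (Vectorization)

Operations between NumPy arrays are applied element by element without writing loops.
Arrays of different shapes can also be combined, as long as the shapes are compatible (Broadcasting).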

In [44]:

arr = np.array([[1., 2., 3.], [4., 5., 6.]])

arr * arr

Out[44]:

array([[ 1.,  4.,  9.],
       [16., 25., 36.]])

In [57]:

arr2 = np.array([10, 2, 0])

arr - arr2

Out[57]:

array([[-9.,  0.,  3.],
       [-6.,  3.,  6.]])
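Broadcasting compares shapes from the trailing dimension backward; two dimensions are compatible when they are equal or when one of them is 1. The sketch below shows a row vector, a column vector, and an incompatible shape (the names row and col are illustrative):

```python
import numpy as np

arr = np.array([[1., 2., 3.], [4., 5., 6.]])   # shape (2, 3)
row = np.array([10., 2., 0.])                  # shape (3,): stretched across rows
col = np.array([[100.], [200.]])               # shape (2, 1): stretched across columns

row_result = arr - row
col_result = arr + col
print(row_result)
print(col_result)

# Shapes (2, 3) and (2,) are incompatible: trailing dimensions 3 and 2 differ.
try:
    arr + np.array([1., 2.])
    failed = False
except ValueError as exc:
    failed = True
    print("broadcast error:", exc)
```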

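🔻Feature 3. Indexing and slicing across multiple dimensions

Unlike a Python list slice, an array slice does not copy the data: it is a view, so assigning to the slice also changes the original array.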

In [72]:

arr = np.arange(10)
r_arr = arr[5:8]

r_arr[0] = 100
arr[6] = 200

display(arr, r_arr)

array([  0,   1,   2,   3,   4, 100, 200,   7,   8,   9])

array([100, 200,   7])

🔻To get an independent array, use the copy method, which copies the data.

In [74]:

arr = np.arange(10)
r_arr = arr[5:8].copy()

r_arr[0] = 100
arr[6] = 200

display(arr, r_arr)

array([  0,   1,   2,   3,   4,   5, 200,   7,   8,   9])

array([100,   6,   7])
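Whether two arrays share the same data can be checked directly rather than inferred from side effects. A short sketch using np.shares_memory and the base attribute (the names view and dup are illustrative):

```python
import numpy as np

arr = np.arange(10)
view = arr[5:8]          # basic slicing returns a view onto arr's data
dup = arr[5:8].copy()    # an explicit copy owns its own data

print(np.shares_memory(arr, view))   # True
print(np.shares_memory(arr, dup))    # False
print(view.base is arr, dup.base is None)
```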

🔻In a higher-dimensional array, indexing with a single index returns a lower-dimensional array rather than a scalar.

In [87]:

arr3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

print(arr3d[1][0][2], arr3d[1, 0, 2])

# Indexing a 3-D array with slices
arr3d[:, :, :2]

9 9

Out[87]:

array([[[ 1,  2],
        [ 4,  5]],

       [[ 7,  8],
        [10, 11]]])
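To make the point about lower-dimensional results concrete, the sketch below indexes the same arr3d with one and two indices, and assigns a scalar to a whole sub-array (the name saved is illustrative):

```python
import numpy as np

arr3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

print(arr3d[0].shape)   # (2, 3): one index leaves a 2-D array
print(arr3d[1, 0])      # [7 8 9]: two indices leave a 1-D array

saved = arr3d[0].copy()
arr3d[0] = 42           # the scalar is broadcast over the whole 2 x 3 block
filled = arr3d[0].copy()
arr3d[0] = saved        # restore the original values
```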

In [91]:

arr = np.arange(32).reshape(8, 4)
display(arr)

arr[[1, 5, 7, 2], [0, 3, 1, 2]]

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31]])

Out[91]:

array([ 4, 23, 29, 10])
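Passing two index lists selects the paired elements (1, 0), (5, 3), (7, 1), (2, 2), not a rectangular block. If the intent is every selected row crossed with every selected column, np.ix_ is one way to express it; a sketch with the same indices:

```python
import numpy as np

arr = np.arange(32).reshape(8, 4)
rows = [1, 5, 7, 2]
cols = [0, 3, 1, 2]

pairs = arr[rows, cols]            # one element per (row, col) pair
block = arr[np.ix_(rows, cols)]    # 4 x 4 block: selected rows x selected columns

print(pairs)
print(block)
```

Unlike basic slicing, this kind of integer-array indexing returns a copy of the data.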

In [90]:

# Indexing with boolean values
names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
data = np.random.randn(7, 4)

display(data)
print(names == 'Bob')

display(data[names == 'Bob', 2:])

cond = (names == 'Bob') | (names == 'Will')

data[cond] = 0
data

array([[ 0.16518781,  1.27010629,  0.33970517,  0.41172312],
       [-0.00856509,  1.30207202, -0.6496603 , -0.31269057],
       [-1.16598769,  0.8049016 ,  0.90335947, -0.32290462],
       [-0.63757725, -0.41729453, -0.31546959,  0.28063586],
       [ 1.90147904,  1.554558  ,  1.29400689,  0.04032651],
       [ 1.27844914, -0.01802216,  1.06063193, -0.99515425],
       [ 1.42693273, -0.76091266, -0.31879032,  0.66178183]])

[ True False False  True False False False]

array([[ 0.33970517,  0.41172312],
       [-0.31546959,  0.28063586]])

Out[90]:

array([[ 0.        ,  0.        ,  0.        ,  0.        ],
       [-0.00856509,  1.30207202, -0.6496603 , -0.31269057],
       [ 0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  0.        ],
       [ 1.27844914, -0.01802216,  1.06063193, -0.99515425],
       [ 1.42693273, -0.76091266, -0.31879032,  0.66178183]])
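Boolean masks are combined with the element-wise operators & (and), | (or), and ~ (not); the Python keywords and, or, and not do not work on arrays. A deterministic sketch with the same names array and a made-up integer data array:

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
data = np.arange(28).reshape(7, 4)   # illustrative values: row i starts at 4 * i

not_bob = data[~(names == 'Bob')]                        # rows 1, 2, 4, 5, 6
bob_or_will = data[(names == 'Bob') | (names == 'Will')]  # rows 0, 2, 3, 4

print(not_bob.shape)
print(bob_or_will[:, 0])
```

The parentheses around each comparison are required because & and | bind more tightly than ==.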

🔻Transposing arrays and swapping axes with the transpose method (T)

In [ ]:

arr = np.arange(15).reshape((3, 5))

display(arr, arr.T)

arr.dot(arr.T)    # matrix product using the dot method
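The cell above was saved without output. The sketch below works through the same product and also shows axis reordering on a 3-D array with transpose and swapaxes (the names gram and cube are illustrative):

```python
import numpy as np

arr = np.arange(15).reshape((3, 5))
gram = arr @ arr.T          # (3, 5) @ (5, 3) -> (3, 3), symmetric
print(gram)

cube = np.arange(24).reshape((2, 3, 4))
print(cube.transpose((1, 0, 2)).shape)   # (3, 2, 4): axes 0 and 1 exchanged
print(cube.swapaxes(1, 2).shape)         # (2, 4, 3): axes 1 and 2 exchanged
```

Both T and swapaxes return views, so no data is copied.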

🔻Array functions (universal functions)

Unary function	Description
abs, fabs	Compute the element-wise absolute value for integer, floating-point, or complex values (fabs does not accept complex input)
sqrt	Compute the square root (equivalent to arr ** 0.5)
square	Compute the square (equivalent to arr ** 2)
exp	Compute e^x
log, log10, log2, log1p	Natural logarithm (base e), base-10 logarithm, base-2 logarithm, and log(1 + x), respectively
sign	Compute the sign of each element: 1 (positive), 0 (zero), or -1 (negative)
ceil	Smallest integer greater than or equal to each element (ceiling)
floor	Largest integer less than or equal to each element (floor)
rint	Round elements to the nearest integer, preserving the dtype
modf	Return the fractional and integral parts of the array as separate arrays
isnan	Return boolean array indicating whether each value is NaN (Not a Number)
isfinite, isinf	Return boolean array indicating whether each element is finite (non-inf, non-NaN) or infinite, respectively
cos, cosh, sin, sinh, tan, tanh	Regular and hyperbolic trigonometric functions
arccos, arccosh, arcsin, arcsinh, arctan, arctanh	Inverse trigonometric and inverse hyperbolic functions
logical_not	Compute truth value of not x element-wise (equivalent to ~arr for boolean arrays)

Binary function	Description
add	Add corresponding elements in arrays
subtract	Subtract elements in second array from first array
multiply	Multiply array elements
divide, floor_divide	Divide or floor divide (truncating the remainder)
power	Raise elements in first array to powers indicated in second array
maximum, fmax	Element-wise maximum; fmax ignores NaN
minimum, fmin	Element-wise minimum; fmin ignores NaN
mod	Element-wise modulus (remainder of division)
copysign	Copy sign of values in second argument to values in first argument
greater, greater_equal, less, less_equal, equal, not_equal	Perform element-wise comparison, yielding boolean array (equivalent to infix operators >, >=, <, <=, ==, !=)
logical_and, logical_or, logical_xor	Compute element-wise truth value of logical operation (equivalent to infix operators &, |, ^ for boolean arrays)

In [32]:

arr = np.arange(10)

# Unary functions
display(np.sqrt(arr), np.exp(arr))

# Binary functions
np.add(arr, arr)    # element-wise sum of two arrays

rng = np.random.RandomState(123)    # a RandomState object with a fixed seed gives reproducible random values
x = rng.randn(10)
y = rng.randn(10)
display(x, y)

np.maximum(x, y)    # element-wise maximum of two arrays

rem_part, int_part = np.modf(x)
display(rem_part, int_part)

array([0.        , 1.        , 1.41421356, 1.73205081, 2.        ,
       2.23606798, 2.44948974, 2.64575131, 2.82842712, 3.        ])

array([1.00000000e+00, 2.71828183e+00, 7.38905610e+00, 2.00855369e+01,
       5.45981500e+01, 1.48413159e+02, 4.03428793e+02, 1.09663316e+03,
       2.98095799e+03, 8.10308393e+03])

array([-1.0856306 ,  0.99734545,  0.2829785 , -1.50629471, -0.57860025,
        1.65143654, -2.42667924, -0.42891263,  1.26593626, -0.8667404 ])

array([-0.67888615, -0.09470897,  1.49138963, -0.638902  , -0.44398196,
       -0.43435128,  2.20593008,  2.18678609,  1.0040539 ,  0.3861864 ])

array([-0.0856306 ,  0.99734545,  0.2829785 , -0.50629471, -0.57860025,
        0.65143654, -0.42667924, -0.42891263,  0.26593626, -0.8667404 ])

array([-1.,  0.,  0., -1., -0.,  1., -2., -0.,  1., -0.])
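The table distinguishes maximum from fmax by how they treat NaN. A small sketch with hand-picked inputs makes the difference visible (the arrays a and b are made up for illustration):

```python
import numpy as np

a = np.array([1.0, np.nan, 3.0])
b = np.array([2.0, 0.5, np.nan])

mx = np.maximum(a, b)   # NaN propagates: a NaN on either side gives NaN
fm = np.fmax(a, b)      # NaN is ignored when the other value is a number

print(mx)
print(fm)
```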

🔻Using meshgrid to build x, y coordinate grids from 1-D data

In [37]:

points = np.arange(-5, 5, 0.01)
# display(points)

x, y = np.meshgrid(points, points)
display(x, y)

array([[-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
       [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
       [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
       ...,
       [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
       [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
       [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99]])

array([[-5.  , -5.  , -5.  , ..., -5.  , -5.  , -5.  ],
       [-4.99, -4.99, -4.99, ..., -4.99, -4.99, -4.99],
       [-4.98, -4.98, -4.98, ..., -4.98, -4.98, -4.98],
       ...,
       [ 4.97,  4.97,  4.97, ...,  4.97,  4.97,  4.97],
       [ 4.98,  4.98,  4.98, ...,  4.98,  4.98,  4.98],
       [ 4.99,  4.99,  4.99, ...,  4.99,  4.99,  4.99]])
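The two grids are typically used to evaluate a function at every (x, y) point without loops. A sketch on a much smaller grid, computing the distance from the origin (the choice of function and the name z are illustrative):

```python
import numpy as np

points = np.arange(-2, 3)              # [-2, -1, 0, 1, 2]
x, y = np.meshgrid(points, points)     # x varies along columns, y along rows

z = np.sqrt(x ** 2 + y ** 2)           # distance of each grid point from (0, 0)

print(x.shape, y.shape)
print(z[2, 2], z[0, 0])
```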

🔻Conditional arrays: where

In [44]:

xarr = np.array([1.1, 1.2, 1.3, 1.4, 1.5])
yarr = np.array([2.1, 2.2, 2.3, 2.4, 2.5])
cond = np.array([True, False, True, True, False])

# Take the value from xarr where cond is True, otherwise from yarr
# Plain Python approach
result = [x if c else y for x, y, c in zip(xarr, yarr, cond)]

# Using where
result2 = np.where(cond, xarr, yarr) # (condition, value if true, value if false)


arr = np.random.randn(4, 4)
np.where(arr > 0, 2, arr)    # replace only the positive values in arr with 2

Out[44]:

array([[-1.79041903, -0.04314963, -1.01499544,  2.        ],
       [-0.54022656,  2.        , -0.86275418, -0.44261949],
       [ 2.        ,  2.        , -0.7141492 , -0.02911694],
       [-1.17001393, -0.94759364,  2.        , -1.12460835]])
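np.where also accepts scalars for both branches, and when called with only the condition it returns the indices where the condition holds. A deterministic sketch (the array values are made up for illustration):

```python
import numpy as np

arr = np.array([[-1.5, 0.3], [2.0, -0.7]])

signs = np.where(arr > 0, 2, -2)    # scalars for both branches
rows, cols = np.where(arr > 0)      # condition only: index arrays of the True positions

print(signs)
print(rows, cols)
```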

🔻Mathematical and statistical (aggregation) methods

Method	Description
sum	Sum of all the elements in the array or along an axis; zero-length arrays have sum 0
mean	Arithmetic mean; zero-length arrays have NaN mean
std, var	Standard deviation and variance, respectively, with optional degrees of freedom adjustment (default denominator n)
min, max	Minimum and maximum
argmin, argmax	Indices of minimum and maximum elements, respectively
cumsum	Cumulative sum of elements starting from 0
cumprod	Cumulative product of elements starting from 1

In [48]:

arr = np.random.randn(5, 4)

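# Both forms are equivalent
arr.mean()
np.mean(arr)

# axis=0: reduce along the first axis (one result per column); axis=1: reduce along the second axis (one result per row)
np.sum(arr, axis=0)
arr.sum(axis=1)

# cum- methods return running (cumulative) values
arr.cumsum(axis=1)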
arr.cumsum(axis=1)

Out[48]:

array([[-0.21920479,  0.53805856,  0.73000529,  0.83701849],
       [-1.44083335, -1.67964609, -0.86891142,  1.50331982],
       [-0.90601866, -1.71843216, -2.76814632, -2.54910539],
       [ 1.6441969 ,  1.9889915 ,  0.32789883,  1.79154451],
       [ 2.01855553,  2.00297183,  2.56570024,  2.21073258]])
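The axis argument names the dimension that is collapsed, which is easier to see on a small array with known values. The sketch also shows the ddof adjustment mentioned in the table for var:

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)     # [[0, 1, 2], [3, 4, 5]]

col_sums = arr.sum(axis=0)           # collapse rows: one sum per column
row_sums = arr.sum(axis=1)           # collapse columns: one sum per row
running = arr.cumsum(axis=1)         # running total within each row

pop_var = arr.var()                  # denominator n (default)
sample_var = arr.var(ddof=1)         # denominator n - 1

print(col_sums, row_sums)
print(running)
print(pop_var, sample_var)
```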

🔻Boolean array methods: any, all

In [51]:

arr = np.random.randn(10)

display(arr)

print((arr > 0).any(), (arr > 0).all())

array([ 1.2803467 , -0.33178594,  0.6037433 , -0.03406027,  0.26539354,
       -0.41919542, -0.92583522,  0.2123503 , -0.85288787,  2.07504041])

True False
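The array methods any and all also take an axis argument, and summing a boolean array counts its True values. Python's built-in any and all only behave as expected on 1-D arrays; the sketch below shows the built-in failing on a 2-D mask (the array values are made up for illustration):

```python
import numpy as np

arr = np.array([[1.2, -0.3, 0.6], [-0.1, -0.4, -0.9]])
mask = arr > 0

n_positive = mask.sum()          # True counts as 1
row_any = mask.any(axis=1)       # does each row contain a positive value?
col_all = mask.all(axis=0)       # is every value in each column positive?

print(n_positive, row_any, col_all)

# The built-in any() iterates over rows and tries to convert each row to a single bool.
try:
    any(mask)
    builtin_failed = False
except ValueError:
    builtin_failed = True
```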

🔻Sorting multidimensional arrays: sort

In [54]:

arr = np.random.randn(3, 5)
display(arr)

arr.sort(axis=1)
display(arr)

array([[-1.84135336,  0.87521296, -0.26409867,  0.90424928, -0.11759999],
       [-1.35606042,  0.9335901 , -0.26804056, -0.94326899,  0.75789299],
       [-0.68060814,  0.91820999,  0.38092156, -0.91052124, -0.85421632]])

array([[-1.84135336, -0.26409867, -0.11759999,  0.87521296,  0.90424928],
       [-1.35606042, -0.94326899, -0.26804056,  0.75789299,  0.9335901 ],
       [-0.91052124, -0.85421632, -0.68060814,  0.38092156,  0.91820999]])
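The sort method above modifies the array in place. When the original order must be kept, np.sort returns a sorted copy and np.argsort returns the ordering itself; a short sketch (the variable names are illustrative):

```python
import numpy as np

arr = np.array([3, 1, 2])

sorted_copy = np.sort(arr)       # new array; arr is unchanged
order = np.argsort(arr)          # indices that would sort arr
descending = sorted_copy[::-1]   # reverse for descending order

print(arr, sorted_copy, order, descending)

arr.sort()                       # in place: arr itself is now sorted
```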

🔻Set operations on arrays

Method	Description
unique(x)	Compute the sorted, unique elements in x
intersect1d(x, y)	Compute the sorted, common elements in x and y
union1d(x, y)	Compute the sorted union of elements
in1d(x, y)	Compute a boolean array indicating whether each element of x is contained in y (deprecated since NumPy 2.0; prefer isin)
setdiff1d(x, y)	Set difference, elements in x that are not in y
setxor1d(x, y)	Set symmetric differences; elements that are in either of the arrays, but not both

In [58]:

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])

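# Unique elements
display(np.unique(names))

# Test whether each element of one array is contained in a reference array
display(np.isin(names, ['Bob', 'Will']))    # isin replaces in1d, which is deprecated since NumPy 2.0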

array(['Bob', 'Joe', 'Will'], dtype='<U4')

array([ True, False,  True,  True,  True, False, False])
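The remaining functions in the table follow the same pattern. A sketch on two small integer arrays (the values are made up for illustration):

```python
import numpy as np

x = np.array([1, 2, 3, 4, 2])
y = np.array([3, 4, 5])

common = np.intersect1d(x, y)    # in both
either = np.union1d(x, y)        # in at least one
only_x = np.setdiff1d(x, y)      # in x but not in y
one_side = np.setxor1d(x, y)     # in exactly one of the two

print(common, either, only_x, one_side)
```

All four return sorted arrays with duplicates removed.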

🔻Linear algebra

Function	Description
diag	Return the diagonal (or off-diagonal) elements of a square matrix as a 1D array, or convert a 1D array into a square matrix with zeros on the off-diagonal
dot	Matrix multiplication
trace	Compute the sum of the diagonal elements
det	Compute the matrix determinant
eig	Compute the eigenvalues and eigenvectors of a square matrix
inv	Compute the inverse of a square matrix
pinv	Compute the Moore-Penrose pseudo-inverse of a matrix
qr	Compute the QR decomposition
svd	Compute the singular value decomposition (SVD)
solve	Solve the linear system Ax = b for x, where A is a square matrix
lstsq	Compute the least-squares solution to Ax = b

In [61]:

A = np.array([[1., 2., 3.], [4., 5., 6.]])
B = np.array([[6., 23.], [-1, 7], [8, 9]])

# Matrix product (all three forms are equivalent)
display(A.dot(B))
display(np.dot(A, B))
display(A @ B)

array([[ 28.,  64.],
       [ 67., 181.]])

array([[ 28.,  64.],
       [ 67., 181.]])

array([[ 28.,  64.],
       [ 67., 181.]])
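Functions such as det, inv, and solve from the table live in the np.linalg submodule. A sketch on a small, invertible 2 x 2 system chosen for illustration:

```python
import numpy as np

A = np.array([[2., 1.], [1., 3.]])
b = np.array([3., 5.])

x = np.linalg.solve(A, b)        # solves A @ x = b
A_inv = np.linalg.inv(A)
det_A = np.linalg.det(A)

print(x)
print(det_A)
print(np.allclose(A @ A_inv, np.eye(2)))
```

For solving a linear system, solve is generally preferred over multiplying by inv(A), since it avoids forming the inverse explicitly.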

🔻Random number generation

Function	Description
seed	Seed the random number generator
permutation	Return a random permutation of a sequence, or return a permuted range
shuffle	Randomly permute a sequence in-place
rand	Draw samples from a uniform distribution
randint	Draw random integers from a given low-to-high range
randn	Draw samples from a normal distribution with mean 0 and standard deviation 1 (MATLAB-like interface)
binomial	Draw samples from a binomial distribution
normal	Draw samples from a normal (Gaussian) distribution
beta	Draw samples from a beta distribution
chisquare	Draw samples from a chi-square distribution
gamma	Draw samples from a gamma distribution
uniform	Draw samples from a uniform [0, 1) distribution

In [65]:

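# Standard normal distribution
samples = np.random.normal(size=(4, 4))
display(samples)

# Use RandomState with a fixed seed to get reproducible random numbers
rng = np.random.RandomState(1234)    # different seeds give different sequences, but the same seed always reproduces the same values
display(rng.randn(10))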
display(rng.randn(10))

array([[-1.24608141,  1.0200661 , -1.20283232, -0.74973068],
       [ 0.49086181,  0.09410266,  1.44198742, -1.23622175],
       [ 0.59406138, -1.46019353,  0.72229379, -1.0719993 ],
       [ 0.52649228, -0.65087108,  0.7893664 , -0.18004207]])

array([ 0.47143516, -1.19097569,  1.43270697, -0.3126519 , -0.72058873,
        0.88716294,  0.85958841, -0.6365235 ,  0.01569637, -2.24268495])
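RandomState is NumPy's legacy interface. Current NumPy documentation recommends the Generator returned by np.random.default_rng for new code; the sketch below shows the same seeded, reproducible pattern with it. This is an alternative to the approach above, not what the original notebook used, and a given seed produces different numbers than RandomState does.

```python
import numpy as np

rng_a = np.random.default_rng(1234)
rng_b = np.random.default_rng(1234)

a = rng_a.standard_normal(10)    # counterpart of randn
b = rng_b.standard_normal(10)

print(np.array_equal(a, b))      # same seed, same sequence

ints = rng_a.integers(0, 10, size=5)   # integers in [0, 10)
perm = rng_a.permutation(5)            # a shuffled copy of 0..4
print(ints, perm)
```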

In [5]:

from IPython.display import display, HTML

display(HTML("<style>.container { width:90% !important; }</style>"))

# Widen the notebook container to fit the window
