markdown Python
Spyder shortcuts
- Ctrl + click on a built-in function to open that function's source code
[python multi processing](https://abcdabcd987.com/python-multiprocessing/)
***
- Convert a string date column in a DataFrame to a datetime column
'2019/03/22' -> datetime.datetime(2019, 3, 22, 0, 0)
```python
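import pandas as pd
from datetime import datetime

# Method 1: vectorised conversion; returns a Series of Timestamps
s = pd.to_datetime(data['Expiration Date'], format='%Y/%m/%d')
# Method 2: element-wise conversion; returns a list of datetime objects
s = [datetime.strptime(date, '%Y/%m/%d') for date in data['Expiration Date']]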
```
- strptime is able to parse non-zero padded values.
```python
from datetime import datetime
datetime.strptime("3/1/2014 9:55", "%m/%d/%Y %H:%M")
# output: datetime.datetime(2014, 3, 1, 9, 55)
```
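As a quick check of the two conversion methods above, the sketch below builds a small frame with invented values and parses it both ways. The sample data and the `errors='coerce'` argument (which turns unparseable strings into `NaT`) are additions for illustration, not part of the original notes.

```python
from datetime import datetime

import pandas as pd

# Hypothetical sample data; the last entry is deliberately malformed
data = pd.DataFrame({'Expiration Date': ['2019/03/22', '2019/04/05', 'n/a']})

# Vectorised parse; errors='coerce' maps unparseable strings to NaT
parsed = pd.to_datetime(data['Expiration Date'], format='%Y/%m/%d', errors='coerce')

# Element-wise parse of the two valid rows with strptime
manual = [datetime.strptime(d, '%Y/%m/%d') for d in data['Expiration Date'].iloc[:2]]
```

Without `errors='coerce'`, `pd.to_datetime` raises on the malformed entry, which may be preferable when bad input should stop the script.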
***
- Get the current working directory and add a folder to the module search path
```python
import os
os.getcwd()
import sys
sys.path.append('C:\\Users\\SABR')
```
## dataframe
- Print a list without brackets on a single line: names = ["Sam", "Peter", "James"] ---> Sam, Peter, James
```python
', '.join(names)
```
- Count the NaN values in a DataFrame column
```python
len(df) - df['a'].count()
```
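The subtraction above works because `count()` skips missing values. A minimal sketch with an invented frame, also showing `isna().sum()` as an equivalent that extends to every column at once:

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing entries in both columns
df = pd.DataFrame({'a': [1.0, np.nan, 3.0, np.nan], 'b': ['x', None, 'z', 'w']})

# count() skips missing values, so the difference is the number of NaNs
n_missing_a = len(df) - df['a'].count()

# Equivalent: isna() marks missing entries, sum() counts the True values
n_missing_a_alt = df['a'].isna().sum()

# Missing values per column for the whole frame
missing_per_column = df.isna().sum()
```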
- Split dataframe by groupby
```python
gb = df.groupby('underlying')
[gb.get_group(x) for x in gb.groups]
```
- get a list of groupby keys
```python
list(gb.groups) # method 1
gb.groups.keys() # method 2
```
- Create a dictionary (key: the group name, value: the group DataFrame)
```python
data_dict = {x: gb.get_group(x) for x in gb.groups}
data = data_dict[list(data_dict.keys())[1]]
```
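Iterating over a GroupBy object yields `(key, sub-frame)` pairs, so the same dictionary can be built without `get_group`. A small sketch with invented data (the column values are placeholders):

```python
import pandas as pd

# Hypothetical option data
df = pd.DataFrame({
    'underlying': ['SPX', 'SPX', 'NDX', 'RUT'],
    'strike': [2800, 2850, 7500, 1500],
})

# Each iteration gives the group key and the rows belonging to it
data_dict = {key: group for key, group in df.groupby('underlying')}

# Group keys come out sorted by default
keys = list(data_dict)
```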
***
- Filter rows containing a string pattern from a dataframe
```python
data = data[data['Ticker'].str.contains("SPXW")]
```
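- Filter None (null) values in a DataFrame
```python
# Keep the 'C' rows whose 'iv' value is present
df_c = df[(df['right'] == 'C') & (df['iv'].notnull())]
# Keep only the rows whose 'iv' value is missing
df_null = df[df['iv'].isnull()]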
```
- Assign the same value to all rows of a DataFrame column that satisfy a condition on other columns.
- For example, assign a forward price to the rows whose expiry equals a specific value
```python
# fit_forward is defined elsewhere (not shown here)
data["forward"] = np.nan
data["pv_factor"] = np.nan
for time_to_expiry, df in data_dict.items():
    # Index of the 10 pairs of near-the-money option contracts
    NTM_index = list(abs(df['mid_diff']).sort_values()[:10].index)
    result = fit_forward(ntm_mkt_diff=df['mid_diff'][NTM_index],
                         ntm_strikes=df['strike'][NTM_index],
                         params0=[1, np.mean(df['mid_diff'] + df['strike'])],
                         bounds=((0.5, 2), (min(df['strike']), max(df['strike']))))
    data.loc[data['time_to_expiry'] == time_to_expiry, "pv_factor"] = result[0]
    data.loc[data['time_to_expiry'] == time_to_expiry, "forward"] = result[1]
```
***
- Split a pandas column into two based on a delimiter that may not exist on all values
Use str.split with expand=True to return a DataFrame. n is the maximum number of splits; the default of -1 splits at every occurrence of the delimiter.
```python
df[['Hedge Acct', 'Security']] = df['Security'].str.split('/',n=1, expand=True)
```
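With the one-line assignment above, a row that has no delimiter ends up with its whole text in the first column and None in the second. The sketch below, on invented values, shows one possible way to keep the full text in `Security` for such rows; the `where` step is an addition, not part of the original note.

```python
import pandas as pd

# Hypothetical values: the second row has no '/' delimiter
df = pd.DataFrame({'Security': ['ACC1/SPX 2900 C', 'SPX 2950 P', 'ACC2/SPX/W 3000 C']})

# n=1 splits only at the first '/'; expand=True returns a two-column DataFrame
parts = df['Security'].str.split('/', n=1, expand=True)

# Rows without the delimiter have None in the second column
no_delim = parts[1].isna()

# Keep the account only where a delimiter was present, and the full text otherwise
df['Hedge Acct'] = parts[0].where(~no_delim)
df['Security'] = parts[1].where(~no_delim, parts[0])
```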
- Pandas add column with value based on condition based on other columns
[stackoverflow](https://stackoverflow.com/questions/50375985/pandas-add-column-with-value-based-on-condition-based-on-other-columns)
```python
import pandas as pd
import numpy as np
d = {'age': [21, 45, 45, 5],
     'salary': [20, 40, 10, 100]}
df = pd.DataFrame(d)
# add an extra column called "is_rich" which captures if a person is rich depending on his/her salary
# method 1
df['is_rich_method1'] = np.where(df['salary']>=50, 'yes', 'no')
# method 2
df['is_rich_method2'] = ['yes' if x >= 50 else 'no' for x in df['salary']]
# method 3
df['is_rich_method3'] = 'no'
df.loc[df['salary'] >= 50, 'is_rich_method3'] = 'yes'
```
- In pandas, indexing a DataFrame can return either a view of the original data or a copy, depending on the operation. Changing the subset may therefore change the original DataFrame, or trigger a SettingWithCopyWarning. Call .copy() on the subset when the original DataFrame must stay unchanged.
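
A minimal sketch of the defensive copy described above, reusing the age/salary example with invented values:

```python
import pandas as pd

df = pd.DataFrame({'age': [21, 45, 5], 'salary': [20, 40, 100]})

# An explicit copy is independent of the original frame
subset = df[df['age'] > 18].copy()
subset['salary'] = 0

# The original frame still holds the original salaries
original_salaries = df['salary'].tolist()
```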
- How to get indices of a sorted array in Python (series.sort_values returns a series with the sorted index)
```python
abs(data['mid_diff']).sort_values().index  # index labels in sorted order
```
- dataframe apply with if and else statement
```python
from pandas import DataFrame
Numbers = {'set_of_numbers': [1,2,3,4,5,6,7,8,9,10]}
df = DataFrame(Numbers,columns=['set_of_numbers'])
df['equal_or_lower_than_4'] = df['set_of_numbers'].apply(lambda x: 'True' if x <= 4 else 'False')
print (df)
```
***
- Filter dictionary by the value
```python
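diff = {'a': 2, 'b': 1}
# Method 1: keys whose absolute value is below the tolerance
roots = list(filter(lambda x: abs(diff[x]) < 10**(-10), diff.keys()))
# Method 2: the same filter as a dict comprehension (keeps the values too)
roots = {key: value for key, value in diff.items() if abs(value) < 10**(-10)}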
```
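With the sample dictionary above, neither value is below the tolerance, so both results are empty. The sketch below uses invented values that do fall under a hypothetical tolerance `tol`:

```python
# Hypothetical residuals; 'b' and 'c' are numerically zero
diff = {'a': 2.0, 'b': 1e-12, 'c': -3e-11}
tol = 1e-10  # hypothetical tolerance

# abs() keeps small negative values as well as small positive ones
roots = {key: value for key, value in diff.items() if abs(value) < tol}
root_keys = list(roots)
```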
## Optimum approach for iterating over a DataFrame
1. [Optimum approach for iterating over a DataFrame](https://medium.com/@rtjeannier/pandas-101-cont-9d061cb73bfc)
2. [Apply a function to every row in a pandas dataframe](http://jonathansoma.com/lede/foundations/classes/pandas%20columns%20and%20functions/apply-a-function-to-every-row-in-a-pandas-dataframe/)
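
As a rough illustration of the options those two articles compare, the sketch below computes the same column three ways. The column names and values are invented, and the speed ordering in the comments is the usual rule of thumb rather than a measurement.

```python
import pandas as pd

df = pd.DataFrame({'strike': [2800, 2850, 2900], 'spot': [2850, 2850, 2850]})

# 1. Vectorised: operate on whole columns at once (usually the fastest)
df['moneyness_vec'] = df['strike'] / df['spot']

# 2. apply with axis=1: calls the function once per row
df['moneyness_apply'] = df.apply(lambda row: row['strike'] / row['spot'], axis=1)

# 3. itertuples: explicit Python loop over lightweight row tuples
df['moneyness_loop'] = [row.strike / row.spot for row in df.itertuples(index=False)]
```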
- Convert an Index object to other types
```python
df.index.values
Out[289]:
array(['2795', '2800', '2805', '2810', '2815', '2820', '2825', '2830',
'2835', '2840', '2845', '2850', '2855', '2860', '2865', '2870',
'2875', '2880', '2885', '2890', '2895', '2900', '2905', '2910',
'2915', '2920', '2925', '2930', '2935', '2940', '2945', '2950',
'2955', '2960', '2965', '2970', '2975', '2980', '2985', '2990'],
dtype=object)
df_blg.index.astype(int)
Out[297]:
Int64Index([2795, 2800, 2805, 2810, 2815, 2820, 2825, 2830, 2835, 2840, 2845,
2850, 2855, 2860, 2865, 2870, 2875, 2880, 2885, 2890, 2895, 2900,
2905, 2910, 2915, 2920, 2925, 2930, 2935, 2940, 2945, 2950, 2955,
2960, 2965, 2970, 2975, 2980, 2985, 2990],
dtype='int64')
```
- DataFrame to a list of dictionaries
```python
df.to_dict(orient='records')
```
- DataFrame to a list of tuples
```python
list(map(tuple, df.values))
```
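A short sketch of both conversions on an invented two-row frame. `itertuples(index=False, name=None)` is shown as an alternative to `map(tuple, df.values)` that avoids building an intermediate NumPy array.

```python
import pandas as pd

df = pd.DataFrame({'strike': [2800, 2850], 'right': ['C', 'P']})

# One dictionary per row, keyed by column name
records = df.to_dict(orient='records')

# One plain tuple per row, without the index
rows = list(df.itertuples(index=False, name=None))
```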
- python pickle
```python
import pickle

d = {"a": 1, "b": 2}
with open(r"someobject.p", "wb") as output_file:
    pickle.dump(d, output_file)
```
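To read the object back, open the file in binary read mode and call `pickle.load`. The sketch below does a full round trip in a temporary directory (an assumption made so the example leaves no files behind).

```python
import os
import pickle
import tempfile

d = {"a": 1, "b": 2}

# Write to a temporary directory so the example leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "someobject.p")
    with open(path, "wb") as output_file:
        pickle.dump(d, output_file)
    with open(path, "rb") as input_file:
        restored = pickle.load(input_file)
```

Only unpickle files from a trusted source, since loading a pickle can execute arbitrary code.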
- Matplotlib Plot Lines with Colors Through Colormap
```python
import numpy as np
import matplotlib.pylab as pl
x = np.linspace(0, 2*np.pi, 64)
y = np.cos(x)
pl.figure()
pl.plot(x,y)
n = 20
colors = pl.cm.jet(np.linspace(0,1,n))
for i in range(n):
    pl.plot(x, i*y, color=colors[i])
```
- Read an Excel workbook with multiple sheets
```python
xlsm = pd.ExcelFile('data/Macro_v1.12_20Jun2019.xlsm')
pos = pd.read_excel(xlsm, 'Real Time') # 'Real Time' is the sheet name
```
***
- SQL from Python (pymssql)
```python
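import pandas as pd
import pymssql

# Raw string so the backslash in the server name is not treated as an escape
conn = pymssql.connect(host=r'CRDBTTERSDEV01\SQLDEV01', database='ERS')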
cur = conn.cursor()
query = "select * from ERS.TRADER.stickyStrikeVol where valuedate = '2019-6-6' and current_run = 1"
df = pd.read_sql(query, conn)
```
```python
import pymssql
from datetime import datetime

# Raw string so the backslash in the server name is not treated as an escape
conn = pymssql.connect(host=r'CRDBTTERSDEV01\SQLDEV01', database='ERS')
cur = conn.cursor()

def doQry(qry, echo=True):
    """Run a query; return the rows if echo is True, otherwise commit."""
    try:
        cur.execute(qry)
        if echo:
            results = cur.fetchall()
            return results
        else:
            conn.commit()
    except Exception:
        # Log the failing query with a timestamp instead of raising
        with open("db_log.txt", "a+") as f:
            f.write(datetime.now().strftime('%Y/%m/%d %H:%M:%S') + ' -- ' + qry + "\n")

# underlying, vdate and expDt are defined elsewhere
qry = 'select distinct strike from Trader.openint3 ' \
      ' where valuedate = (select max(valuedate) from Trader.OpenInt3 where underlying = \'' + underlying + '\' and valuedate <= \'' + vdate.strftime('%Y-%m-%d') + '\')' \
      ' and underlying = \'' + underlying + '\' and expiry = \'' + expDt.strftime('%Y-%m-%d') + '\'' \
      ' and tag = \'' + underlying + '\'' \
      ' and underlying = ' + '\'' + underlying + '\' order by strike asc;'
result = doQry(qry, True)
```
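The query above is assembled by string concatenation, which is hard to read and unsafe if any value comes from user input. DB-API drivers accept placeholders instead. The sketch below uses the standard-library `sqlite3` module with an in-memory table as a stand-in, because the SQL Server instance is not available here; the table layout, the sample rows and the helper name `distinct_strikes` are all hypothetical. `sqlite3` uses `?` placeholders, while pymssql documents `%s`-style placeholders.

```python
import sqlite3

# In-memory stand-in for the real database (hypothetical schema and rows)
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("create table openint3 (underlying text, expiry text, strike real)")
cur.executemany(
    "insert into openint3 values (?, ?, ?)",
    [("SPX", "2019-06-21", 2900.0), ("SPX", "2019-06-21", 2800.0),
     ("SPX", "2019-07-19", 2850.0), ("NDX", "2019-06-21", 7500.0)],
)

def distinct_strikes(cur, underlying, expiry):
    """Return the sorted distinct strikes for one underlying and expiry."""
    # Values are passed separately, so the driver handles quoting
    cur.execute(
        "select distinct strike from openint3 "
        "where underlying = ? and expiry = ? order by strike asc",
        (underlying, expiry),
    )
    return [row[0] for row in cur.fetchall()]

strikes = distinct_strikes(cur, "SPX", "2019-06-21")
conn.close()
```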
# Plot
- Add space between subplots
```python
from math import ceil
import matplotlib.pyplot as plt
import pandas as pd

# citi_vol, svi_estimate, spot and volSurface are defined elsewhere
pivot_citi = pd.pivot_table(citi_vol, values='vol_value', columns='expiry', index='strike')
pivot_svi = volSurface(svi_estimate, pivot_citi.index, pivot_citi.columns, spot)
plt.figure(figsize=(30, 28))
plt.subplots_adjust(hspace=0.3)  # vertical space between subplot rows
for i in range(0, len(pivot_citi.columns)):
    plt.subplot(ceil(len(pivot_citi.columns)/4), 4, i+1)
    expiry = pivot_citi.columns[i]
    plt.scatter(pivot_citi.loc[:, expiry].index, pivot_citi.loc[:, expiry].values, label='citi vol', s=2, c='b')
    plt.scatter(pivot_svi.loc[:, expiry].index, pivot_svi.loc[:, expiry].values, label='svi vol', s=2, c='r')
    plt.legend()
    plt.xlabel('strike')
    plt.ylabel('iVol')
    plt.title('Expiry %.2f'%(expiry))
```
- Do not display the plot
```python
import matplotlib
matplotlib.use('Agg')
```
- Save the figure
```python
fig = plt.figure(figsize = (33, 28))
fig.savefig('pic/ITM.png')
```
- Warning: too many open figures
```python
import matplotlib.pyplot as plt
plt.close('all')
```
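The three plotting notes above (no display, saving, closing) can be combined as in the sketch below. The file names and the temporary output directory are assumptions for illustration; closing each figure right after saving is one way to avoid the "too many open figures" warning in a loop.

```python
import os
import tempfile

import matplotlib
matplotlib.use('Agg')  # non-interactive backend: nothing is displayed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 64)

with tempfile.TemporaryDirectory() as tmp_dir:
    for k in range(3):
        fig, ax = plt.subplots(figsize=(4, 3))
        ax.plot(x, np.cos(k * x))
        fig.savefig(os.path.join(tmp_dir, f'cos_{k}.png'))
        plt.close(fig)  # release the figure so open figures do not accumulate
    saved = sorted(os.listdir(tmp_dir))

# No figures remain open after the loop
open_figures = plt.get_fignums()
```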
# SLSQP drawbacks
[stackoverflow](https://stackoverflow.com/questions/21765794/python-constrained-non-linear-optimization)
While the SLSQP algorithm in scipy.optimize.minimize is good, it has a number of limitations. The first is that it's a QP solver, so it works well for equations that fit well into a quadratic programming paradigm. But what happens if you have functional constraints?
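
For reference, this is how a nonlinear (functional) inequality constraint is passed to SLSQP through `scipy.optimize.minimize`. The toy problem (minimise x² + y² subject to x·y ≥ 1 with non-negative variables) is an invented illustration and not taken from the linked answer; it is small and smooth, so it says nothing about how SLSQP behaves on harder constraints.

```python
from scipy.optimize import minimize

def objective(p):
    """Squared distance from the origin."""
    x, y = p
    return x**2 + y**2

# 'ineq' means the constraint function must be non-negative: x*y - 1 >= 0
constraints = [{'type': 'ineq', 'fun': lambda p: p[0] * p[1] - 1.0}]
bounds = [(0.0, None), (0.0, None)]

res = minimize(objective, x0=[2.0, 1.0], method='SLSQP',
               bounds=bounds, constraints=constraints)
```

The optimiser result exposes the solution as `res.x`, the objective value as `res.fun`, and a convergence flag as `res.success`.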
## Catch error message
```python
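import logging

# getLogger registers the logger in the logging hierarchy
logger = logging.getLogger('catch_all')
try:
    pass  # do something
except Exception as e:
    # exc_info=True appends the traceback to the log record
    logger.error(e, exc_info=True)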
```
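A runnable variant of the pattern above, assuming a hypothetical helper `safe_divide` and an in-memory stream as the log target so the output can be inspected. `logger.exception` is shorthand for `logger.error(..., exc_info=True)` inside an `except` block.

```python
import io
import logging

# Send log records to an in-memory stream for this example
log_stream = io.StringIO()
logger = logging.getLogger('catch_all')
logger.setLevel(logging.ERROR)
logger.addHandler(logging.StreamHandler(log_stream))

def safe_divide(a, b):
    """Return a / b, or None after logging the exception."""
    try:
        return a / b
    except Exception:
        # logger.exception logs at ERROR level and appends the traceback
        logger.exception('division failed for a=%r, b=%r', a, b)
        return None

result = safe_divide(1, 0)
log_text = log_stream.getvalue()
```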