URL Shortener
sho.rt
## Features
**full web app?**
no, start with an API
**authentication?**
no, start with it open
**modify or delete links?**
no, leave it out for now
**persist links forever?**
- could remove links *created* after some time (6 months)
- could remove links not *visited* after some time (6 months)
no, it would suck if an expired link had been used on a private site or printed on paper, so let's keep links forever
**let people choose their shortlink or auto-generate?**
yeah, let's support both
**analytics for click tracking?**
no, let's leave it out for now
## Design Goals
1. be able to store a lot of links
2. shortlinks should be real short
3. redirect should be fast
4. shortlinks should be resilient to heavy load
## Data Model
main object will be a `Link`
```
Link
- shortLink
- longLink
```
`shortLink` can just contain the "slug" instead of the full link so rename the field to `slug`
```
Link
- slug
- longLink
```
`longLink` makes less sense without `shortLink` so rename to `destination`
```
Link
- slug
- destination
```
rename the model to `ShortLink`
```
ShortLink
- slug
- destination
```
## Views/Pages/Endpoints
let's use REST API style with versioning so our endpoint is:
**sho.rt/api/v1/shortlink**
send a POST request with a `destination` arg and an optional `slug` arg to create a new `ShortLink`
```
$ curl --header 'Content-Type: application/json' --data '{"destination": "soundcloud.com"}' https://sho.rt/api/v1/shortlink
{
    "slug": "ae8uFt",
    "destination": "soundcloud.com"
}
```
in REST style, we should also allow GET, PUT, PATCH, and DELETE to read, modify, and delete links, but for now we'll reject non-POST requests with a 501 ("not implemented") error
so the endpoint is something like:
```
function shortlink(request) {
    // only creation is supported for now
    if (request.method !== 'POST') {
        return response(501);
    }

    const destination = request.data.destination;
    let slug = request.data.slug;

    // no slug given, so generate one
    if (typeof slug === 'undefined') {
        slug = generateRandomSlug();
    }

    DB.insertLink(slug, destination);

    // echo back both fields, matching the example response above
    const responseBody = JSON.stringify({'slug': slug, 'destination': destination});
    return response(200, responseBody);
}
```
for random generation, we'll need to figure out:
1. what chars we can use in the randomly generated slug (what is allowed in a URL?)
2. how to make sure a randomly generated slug hasn't already been used, and how to handle a collision if there is one
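one possible way to handle the collision question is to refuse to overwrite an existing slug and retry with a fresh one. the sketch below is only an illustration: the `links` map stands in for the real database, and `insertWithRandomSlug` and `maxAttempts` are made-up names, not part of the design above

```javascript
// Hypothetical in-memory stand-in for the DB used by the endpoint.
const links = new Map();

// Returns false instead of overwriting when the slug is already taken.
function insertLink(slug, destination) {
  if (links.has(slug)) {
    return false;
  }
  links.set(slug, destination);
  return true;
}

// Retries with a fresh slug on collision and gives up after maxAttempts tries.
// generateSlug is any zero-argument function that returns a candidate slug.
function insertWithRandomSlug(destination, generateSlug, maxAttempts = 5) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const slug = generateSlug();
    if (insertLink(slug, destination)) {
      return slug;
    }
  }
  throw new Error('could not find an unused slug');
}
```

with a real database the check and the insert would have to happen atomically (for example through a uniqueness constraint on the slug), otherwise two requests could still grab the same slug at the same time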
**let's make a way to follow a shortlink**
use the format: **sho.rt/$slug**
might want to block certain slugs (like `api`) so those paths stay available for our own site pages
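a blocklist check could look something like this. only `api` comes from the endpoint above; the other entries and the names `RESERVED_SLUGS` and `isReservedSlug` are placeholders for illustration

```javascript
// Hypothetical blocklist: 'api' is used by the endpoint, the rest are guesses.
const RESERVED_SLUGS = new Set(['api', 'about', 'login']);

function isReservedSlug(slug) {
  // Compare in lowercase so look-alikes such as "API" are blocked too.
  return RESERVED_SLUGS.has(slug.toLowerCase());
}

console.log(isReservedSlug('api'));    // true
console.log(isReservedSlug('ae8uFt')); // false
```

the endpoint would run this check on user-specified slugs (and on generated ones) before inserting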
for redirect:
```
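function redirect(request) {
    // assumes request.path is "/" + slug
    const slug = request.path.slice(1);
    const destination = DB.getLinkDestination(slug);
    if (!destination) return response(404);
    return response(302, destination);
}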
```
## Slug Generation
brainstorm around design goals and then design around those goals
1. we should be able to store a lot of links
2. our shortlinks should be as short as possible
the more characters we allow in our shortlinks, the more distinct shortlinks we can have without making them longer
if we allow *c* different characters, for *n* character long slugs, we have *c<sup>n</sup>* possibilities
1. figure out max set of characters we can allow
2. figure out how many possible shortlinks we want to accommodate
3. figure out how long shortlinks need to be to accommodate all the possibilities
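the *c<sup>n</sup>* count is easy to check in code. a small sketch, where `slugCapacity` is just an illustrative name and the 26-letter example is arbitrary:

```javascript
// Number of distinct slugs of exactly `length` characters
// over an alphabet of `alphabetSize` symbols.
// BigInt keeps the result exact even when it gets very large.
function slugCapacity(alphabetSize, length) {
  return BigInt(alphabetSize) ** BigInt(length);
}

// e.g. lowercase letters only, 3 characters long
console.log(slugCapacity(26, 3)); // 17576n
```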
## What characters can we allow in our randomly-generated slugs?
what are the constraints on *c*?
1. only use characters allowed in URLs
2. only pick characters that are easy to type on a keyboard
can google the answer to allowed characters in URLs (defined in RFC 1738):
"only alphanumerics, the special characters `$-_.+!*'(),`, and reserved characters used for their reserved purposes may be used unencoded within a URL"
the path portion of a URL is case-sensitive so google.com/foo is different than google.com/Foo
so the set of allowed characters is:
```
A-Z, a-z, 0-9, and "$-_.+!*'(),"
```
for ease of typing, pull all the special characters out of the list for randomly-generated slugs
users may want to use special characters in their own slugs so maybe allow for those (still no apostrophe)
for A-Z, a-z, 0-9, there are 26 + 26 + 10 = 62 possible characters for randomly-generated slugs
plus another 10 characters (`$-_.+!*(),`) allowed in user-specified slugs, for 72 total
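the two alphabets could be written down like this. the constant names and `isValidUserSlug` are assumptions for the sake of the example, and rejecting empty slugs is an extra choice not discussed above

```javascript
// 62 characters for randomly-generated slugs.
const RANDOM_ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZ' + 'abcdefghijklmnopqrstuvwxyz' + '0123456789';

// 10 extra characters allowed only in user-specified slugs (no apostrophe).
const EXTRA_CHARS = '$-_.+!*(),';
const USER_ALPHABET = RANDOM_ALPHABET + EXTRA_CHARS;

// A user-specified slug is valid if it is non-empty and
// every character comes from the 72-character alphabet.
function isValidUserSlug(slug) {
  return slug.length > 0 && [...slug].every((ch) => USER_ALPHABET.includes(ch));
}

console.log(RANDOM_ALPHABET.length);       // 62
console.log(USER_ALPHABET.length);         // 72
console.log(isValidUserSlug('my-link_2')); // true
console.log(isValidUserSlug("it's"));      // false
```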
## How many distinct slugs do we need?
good question to ask the interviewer
**how many new slugs could be created a day?**
say 100,000 per minute: 100,000 * 60 * 24 = 144 million new links a day, which is about 52.5 billion a year
100 years seems like "almost forever" so 5.2 trillion slugs seems sufficiently large
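the back-of-the-envelope arithmetic, spelled out. the 100,000 per minute rate is the guess from above and 365 days per year is an assumption; the exact products come out slightly above the rounded 52.5 billion and 5.2 trillion used in these notes

```javascript
const linksPerMinute = 100_000;                  // guessed creation rate
const linksPerDay = linksPerMinute * 60 * 24;    // 144,000,000
const linksPerYear = linksPerDay * 365;          // 52,560,000,000
const linksPerCentury = linksPerYear * 100;      // 5,256,000,000,000

console.log(linksPerDay, linksPerYear, linksPerCentury);
```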
## How short can we make our slugs while still getting enough distinct possibilities?
for *c<sup>n</sup>* slugs with a 62-character alphabet, we want 62<sup>n</sup> ~ 5.2 trillion
we can plug this into wolfram alpha to get n ~ 7.09
for 72<sup>n</sup> = 5.2 trillion we get n ~ 6.8
to accommodate the full 5.2 trillion slugs, we'd round up to an 8-character slug for 62-character alphabet and 7-character slug for 72-character alphabet
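the same numbers fall out of a logarithm, since *c<sup>n</sup>* = target means n = log(target) / log(c). a sketch with made-up function names; the rounding-up step uses an exact integer loop rather than trusting floating point:

```javascript
// Fractional length n that solves alphabetSize^n = target.
function fractionalSlugLength(alphabetSize, target) {
  return Math.log(target) / Math.log(alphabetSize);
}

// Smallest whole number of characters whose capacity reaches target.
function minSlugLength(alphabetSize, target) {
  let length = 0;
  let capacity = 1n;
  while (capacity < BigInt(target)) {
    capacity *= BigInt(alphabetSize);
    length += 1;
  }
  return length;
}

const target = 5_200_000_000_000; // 5.2 trillion

console.log(fractionalSlugLength(62, target).toFixed(2)); // 7.09
console.log(minSlugLength(62, target)); // 8
console.log(minSlugLength(72, target)); // 7
```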
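5.2 trillion was a high estimate so let's remove special characters and choose 7 characters for our slugs (62<sup>7</sup> ~ 3.5 trillion)
going down to 6 characters would give almost 2 orders of magnitude fewer slugs (62<sup>6</sup> ~ 57 billion), and going up to 8 characters almost 2 orders of magnitude more (62<sup>8</sup> ~ 218 trillion)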
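with the alphabet and length settled, `generateRandomSlug` from the endpoint could be sketched roughly as below. the constant names are assumptions, and `Math.random()` is used only to keep the example short; it is predictable, so a real service would probably want a cryptographically secure source

```javascript
// 62-character alphabet: A-Z, a-z, 0-9.
const ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
const SLUG_LENGTH = 7;

// Picks `length` characters uniformly at random from ALPHABET.
function generateRandomSlug(length = SLUG_LENGTH) {
  let slug = '';
  for (let i = 0; i < length; i++) {
    const index = Math.floor(Math.random() * ALPHABET.length);
    slug += ALPHABET[index];
  }
  return slug;
}

console.log(generateRandomSlug()); // 7 random characters, different on each run
```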