root/feedmelinks/import/Docs.txt

Revision 639, 3.1 kB (checked in by hirokai, 4 years ago)

done and working! literally, the first time. i feel great.

rationale
======================================================

Importing bookmarks (or favorites) is important because users can
preserve their existing data -- what good is a solution that forces
users to throw away all their work? No way, man!


implementation plan
======================================================

0. user arrives at import screen, sees a picture of the 3 step process:
   "upload, confirm, import!". user uploads the bookmarks file.

1. we parse the bookmarks file, ignoring any unparseables, and we
   generate two "prep files": LINKS.RAW, a file of url / name / taglist rows,
   and TAGS.RAW, a de-duped list of all tags in the file.  (we do this
   conversion because the raw bookmark file is a pain to iterate over
   and we'll need to traverse through this input data more than once.)

        > log: parsed N links from U's bookmarks

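a minimal sketch of the prep-file step, to make the two file shapes concrete. the plan doesn't pin down a row format, so the tab-separated columns, the comma-joined taglist, and the name write_prep_files are all assumptions; the links are assumed to arrive already parsed as (url, name, tags) tuples.

```python
import csv


def write_prep_files(links, links_out, tags_out):
    """Write LINKS.RAW-style rows and a de-duped TAGS.RAW-style list.

    links     -- iterable of (url, name, [tags]) tuples (assumed parser output)
    links_out -- writable text file for url / name / taglist rows
    tags_out  -- writable text file for one tag per line
    Returns the number of links written.
    """
    # Assumption: tab-separated columns, tags joined with commas.
    writer = csv.writer(links_out, delimiter="\t", lineterminator="\n")
    seen_tags = {}  # a dict de-dupes while keeping first-seen order
    count = 0
    for url, name, tags in links:
        writer.writerow([url, name, ",".join(tags)])
        for tag in tags:
            seen_tags.setdefault(tag, None)
        count += 1
    for tag in seen_tags:
        tags_out.write(tag + "\n")
    return count
```

the returned count is the N that the log line for this step wants.
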
2. bounds-check the number of link and tag insertions (line-count
   of LINKS.RAW & TAGS.RAW) to detect attacks from D.O.S. dorks

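the bounds check could be as small as the sketch below. the ceilings MAX_LINKS and MAX_TAGS are made-up numbers, not limits from this plan, and both helper names are hypothetical.

```python
MAX_LINKS = 5000  # hypothetical ceiling, not from the plan
MAX_TAGS = 1000   # hypothetical ceiling, not from the plan


def count_lines(path):
    """Count lines in a prep file without loading it all into memory."""
    with open(path, "r", encoding="utf-8") as f:
        return sum(1 for _ in f)


def check_bounds(link_count, tag_count, max_links=MAX_LINKS, max_tags=MAX_TAGS):
    """Return a list of violation messages; an empty list means OK to proceed."""
    violations = []
    if link_count > max_links:
        violations.append(f"too many links: {link_count} > {max_links}")
    if tag_count > max_tags:
        violations.append(f"too many tags: {tag_count} > {max_tags}")
    return violations
```

in use, something like check_bounds(count_lines("LINKS.RAW"), count_lines("TAGS.RAW")) would feed the violations log.
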
3. randomly sample (both contiguous run and scatter) the links to
   check for uniqueness against other links in the imported bookmarks
   file as well as uniqueness against the user's existing FML links
   (to detect possible link spammers)

        > log: any possible violations from 2, 3

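one way the "contiguous run and scatter" sampling might look, as a sketch only. the function names, the sample sizes, and the idea of reducing the check to a single duplicate ratio are assumptions; the plan doesn't say how many links to sample or what ratio counts as spam.

```python
import random


def sample_indexes(n, run_len, scatter_n, rng=random):
    """Pick indexes from range(n): one contiguous run plus a random scatter."""
    if n == 0:
        return []
    run_len = min(run_len, n)
    start = rng.randrange(n - run_len + 1)
    picked = set(range(start, start + run_len))              # contiguous run
    picked.update(rng.sample(range(n), min(scatter_n, n)))   # scatter
    return sorted(picked)


def duplicate_ratio(urls, indexes, existing_urls):
    """Fraction of sampled urls that repeat within the file or are
    already among the user's existing links."""
    if not indexes:
        return 0.0
    counts = {}
    for u in urls:
        counts[u] = counts.get(u, 0) + 1
    dupes = sum(
        1 for i in indexes
        if counts[urls[i]] > 1 or urls[i] in existing_urls
    )
    return dupes / len(indexes)
```

a high ratio would then be logged as a possible violation; where to draw that line is left open here.
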
[if OK to proceed, i.e. no grievous errors...]

4. now we exhaustively check LINKS.RAW (looking at just the links)
   and TAGS.RAW against the user's links and tags in the FML
   database, and stream the files back out to disk, marking an X
   before any link or tag found in the DB. save the new files as
   LINKS.READY and TAGS.READY

        > log: number of import dupes for U

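a sketch of the streaming pass that turns a .RAW file into a .READY file. the exact marker layout is an assumption: here every row gains a leading column that is "X" for dupes and empty otherwise. the set of existing urls or tags is assumed to have been fetched from the FML database already, and mark_existing and url_of are hypothetical names.

```python
def url_of(row):
    """Key for LINKS.RAW-style rows: the first tab-separated column."""
    return row.split("\t", 1)[0]


def mark_existing(raw_lines, existing, out, key=lambda row: row):
    """Stream rows to out, putting an X column before rows already in the DB.

    raw_lines -- iterable of lines from LINKS.RAW or TAGS.RAW
    existing  -- set of urls (or tags) the user already has
    key       -- extracts the value to look up from a row
    Returns the number of dupes marked.
    """
    dupes = 0
    for line in raw_lines:
        row = line.rstrip("\n")
        if key(row) in existing:
            out.write("X\t" + row + "\n")
            dupes += 1
        else:
            out.write("\t" + row + "\n")
    return dupes
```

for TAGS.RAW the default key works as is, since each row is just the tag. the return value is the dupe count for the log.
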
5. display the links to be imported by looping over LINKS.READY
   (with a link & tag count), showing all the links with any
   duplicates in gray, and let the user OK the import. (remind
   them that all these links will be private by default.)

6. using the prep files, generate the proper SQL commands to INSERT
   link, tag, and link<->tag rows and run them. ideally
   double-check the generated SQL for any glaring security
   (SQL injection) bugs (might be good to run the SQL on a
   temp table and look for errors, although that's no guarantee)

        > log: any SQL anomalies as a possible d.o.s. dingus attempt by user U with code C

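a hedged sketch of the insert step. note that it swaps the plan's "generate SQL text, then inspect it" approach for bound parameters, which keep user-supplied urls and names out of the SQL text entirely. sqlite3 is only a stand-in for the real FML database, the three-table schema is invented for illustration, and the rows are assumed to be tab-separated as mark, url, name, taglist.

```python
import sqlite3

# Hypothetical schema; the real FML tables are not described in this doc.
SCHEMA = """
CREATE TABLE links (id INTEGER PRIMARY KEY, user_id INTEGER,
                    url TEXT, name TEXT, private INTEGER);
CREATE TABLE tags (id INTEGER PRIMARY KEY, user_id INTEGER, name TEXT,
                   UNIQUE (user_id, name));
CREATE TABLE link_tags (link_id INTEGER, tag_id INTEGER);
"""


def import_ready_rows(conn, user_id, ready_lines):
    """Insert every unmarked row; returns the number of links inserted."""
    inserted = 0
    with conn:  # one transaction: commit on success, roll back on error
        for line in ready_lines:
            mark, url, name, taglist = line.rstrip("\n").split("\t")
            if mark == "X":
                continue  # already in the DB, skip it
            cur = conn.execute(
                "INSERT INTO links (user_id, url, name, private) "
                "VALUES (?, ?, ?, 1)",  # private by default
                (user_id, url, name),
            )
            link_id = cur.lastrowid
            for tag in filter(None, taglist.split(",")):
                conn.execute(
                    "INSERT OR IGNORE INTO tags (user_id, name) VALUES (?, ?)",
                    (user_id, tag),
                )
                tag_id = conn.execute(
                    "SELECT id FROM tags WHERE user_id = ? AND name = ?",
                    (user_id, tag),
                ).fetchone()[0]
                conn.execute(
                    "INSERT INTO link_tags (link_id, tag_id) VALUES (?, ?)",
                    (link_id, tag_id),
                )
            inserted += 1
    return inserted
```

a row with the wrong number of columns raises ValueError here and rolls the whole import back; that kind of failure is what the log line for this step would record.
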
7. remove any temp/prep files

8. teach user how to manage the new, imported links; design a nice
   screen to show the user how to make their imported links public
   and re-tag as necessary


uncaught weaknesses:
        vulnerable to link-name spamming until we check those for dupes too
        vulnerable to query string spam (urls that differ only in their
        query string pass the uniqueness checks)


a few words about the netscape bookmarks file format
======================================================

when you see <A HREF="ANYTHING_BUT_UNESCAPED_DOUBLE_QUOTE"
        this is a link.
        if there is a currently open link, close it and add it.

when you see a <DD>STUFF^WHITESPACE<DT>
        this is a description, grab "STUFF" and add it as a comment on the current link
        this link is now done. add it.

when you see <H3 ADD_DATE=
        this is a new folder. add it.

future
======================================================

how do you parse apple's .plist files?
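
one possible answer, offered as a sketch rather than a plan: python's standard plistlib reads both xml and binary plists. the key names below (WebBookmarkType, Children, URLString, URIDictionary, Title) are an assumption about how safari lays out its bookmarks file and should be checked against a real export; the function names are hypothetical.

```python
import plistlib


def walk_plist_bookmarks(node, folder=None):
    """Yield (url, title, folder) for each leaf under a safari-style node."""
    # Assumed key names; verify against an actual Bookmarks.plist.
    if node.get("WebBookmarkType") == "WebBookmarkTypeLeaf":
        title = node.get("URIDictionary", {}).get("title", "")
        yield node.get("URLString", ""), title, folder
    else:
        for child in node.get("Children", []):
            yield from walk_plist_bookmarks(child, node.get("Title", folder))


def parse_plist_bookmarks(data):
    """Parse plist bytes (xml or binary) into a list of link tuples."""
    return list(walk_plist_bookmarks(plistlib.loads(data)))
```

the tuples have the same url / name / folder shape as the netscape rules produce, so the rest of the import plan would not need to change.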