Topic: Searching for text in a pdf...

Offline TecNik

  • Jr. Member
  • **
  • Posts: 98
  • Karma: 0
  • Gender: Male
Searching for text in a pdf...
« on: May 16, 2007, 05:35:24 AM »
Hi there,

Does anyone know if it's possible to search a PDF for text that has certain attributes (e.g. bold, green, a particular font) using either AppleScript or Acrobat JavaScript?

Thanks,

Nick

Offline rayl

  • Newbie
  • *
  • Posts: 16
  • Karma: 0
  • Gender: Male
    • totalworks
Re: Searching for text in a pdf...
« Reply #1 on: May 20, 2007, 07:34:56 AM »
JavaScript should work. Look at Evermap.com for good links, scripts and tools that might be useful to get started.
Raymond Lareine • MacBook CoreDuo • Mac OS 10.6.8 • Totalworks (retired)

Offline TecNik

Re: Searching for text in a pdf...
« Reply #2 on: May 21, 2007, 06:45:52 AM »
Thanks for the reply and for the link.

It's one I'd already found through searches on this topic.

Thanks again,

Nick

Offline larsen67

  • Sr. Member
  • ****
  • Posts: 459
  • Karma: 10
  • Gender: Male
Re: Searching for text in a pdf...
« Reply #3 on: May 21, 2007, 06:50:16 AM »
What do you want to do with this info once you find it? It's just that you may be able to do this with a preflight droplet. Are you looking for an actual bold font, faux bolding, etc.?

Offline TecNik

Re: Searching for text in a pdf...
« Reply #4 on: May 21, 2007, 06:55:52 AM »
I was hoping to write something that creates bookmarks from certain text on the page, e.g. sub-headings.

I have some JavaScript for creating a bookmark for each page, but not for the sub-headings on each page.
Whilst searching I also found a plugin for Acrobat; I just wondered if I could do something with AppleScript.

Regards,

Nick
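
A rough Acrobat JavaScript sketch of the bookmarks-from-text idea, under some loud assumptions. As far as I know, Acrobat JavaScript exposes the words on a page (Doc.getPageNumWords, Doc.getPageNthWord) but not their font or colour, so this matches on the heading wording rather than on attributes. The function names (pageWords, findHeadingPage, bookmarkHeadings) are made up for illustration, the list of headings is something you supply yourself, and the headings are assumed to contain no punctuation because the words are read with punctuation stripped.

```javascript
// Sketch only: `doc` is an Acrobat Doc object (`this` in the Acrobat JavaScript console).
// Relies on Doc.numPages, Doc.getPageNumWords, Doc.getPageNthWord and Bookmark.createChild.

// Collect the words of one page, with punctuation and whitespace stripped (bStrip = true).
function pageWords(doc, pageIndex) {
  var words = [];
  var count = doc.getPageNumWords(pageIndex);
  for (var i = 0; i < count; i++) {
    words.push(doc.getPageNthWord(pageIndex, i, true));
  }
  return words;
}

// Return the zero-based index of the first page that contains the heading's
// words in order (case-insensitive, whole words only), or -1 if none does.
function findHeadingPage(doc, heading) {
  var cleaned = heading.replace(/^\s+|\s+$/g, "").split(/\s+/).join(" ");
  var target = " " + cleaned.toLowerCase() + " ";
  for (var p = 0; p < doc.numPages; p++) {
    var text = " " + pageWords(doc, p).join(" ").toLowerCase() + " ";
    if (text.indexOf(target) !== -1) {
      return p;
    }
  }
  return -1;
}

// Add one top-level bookmark per heading that is found.
// Returns the headings that were not found on any page.
function bookmarkHeadings(doc, headings) {
  var missing = [];
  var added = 0;
  for (var h = 0; h < headings.length; h++) {
    var page = findHeadingPage(doc, headings[h]);
    if (page === -1) {
      missing.push(headings[h]);
    } else {
      // The expression runs when the bookmark is clicked; pageNum is zero-based.
      // The index `added` keeps the new bookmarks in list order, assuming the
      // document has no bookmarks to begin with.
      doc.bookmarkRoot.createChild(headings[h], "this.pageNum = " + page + ";", added);
      added++;
    }
  }
  return missing;
}
```

In the Acrobat JavaScript console this would be called as something like bookmarkHeadings(this, ["Introduction", "Results"]). Each bookmark only jumps to the page, not to the heading's position on it, and a phrase that also turns up in body text on an earlier page will be bookmarked there instead.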

Offline larsen67

Re: Searching for text in a pdf...
« Reply #5 on: May 21, 2007, 08:11:27 AM »
Nick, back in the days of Acrobat 6 I wrote a script to create bookmarks from the PDF file names. I used to point it at folders of PDFs to create chapters, then join the chapters to create the whole PDF. I did this because Adobe apps would break the sort order (they used Unix file access time, or something like that), which should have been addressed in CS2. I posted my code at Adobe some time ago.

Acrobat 7 has a feature that does almost the identical thing, so I've dropped the script in favour of it. I doubt very much they stole my idea. Have you used File/Create PDF/From Multiple Files…? It's great for bookmarking: you just name the PDF documents. Here is my old code, if it's any use to you.

-- Order your docs by number.name.pdf, for example to get non-alphabetical bookmarks.
-- Each bookmark is named after whatever sits between the two full stops in the file name:
-- 001.Contents Page.pdf
-- 002.Zebras Page.pdf
-- 003.Lions Page.pdf
-- 004.Elephants Page.pdf
-- 005.Antelopes Page.pdf
-- 006.Hippos Page.pdf
-- 007.Index Page.pdf
-- To name bookmarks from plain alphanumeric file names instead, set the bookmark name to text item 1 (in the 3 places marked).
-- File name extensions are assumed.
-- The application name below must match your installed copy of Acrobat.
--------------
set inputFolder to choose folder with prompt "Where is your folder of PDF files…" without invisibles
set filePath to (path to desktop from user domain) as string
--
tell application "Finder"
     set filesList to files in inputFolder
     set thePDFs to count of filesList -- Count the PDFs
     set theFile to item 1 of filesList as alias
end tell
-- Open first PDF with visible
tell application "Adobe Acrobat 7.0 Professional"
     activate
     open theFile
     set docRefA to the name of document 1
     -- Set the view of first document
     set bounds of document 1 to {146, 42, 1198, 1200}
     -- Watch the bookmarks fall in
     set view mode of document 1 to pages and bookmarks
     --
     set ASTID to AppleScript's text item delimiters
     set AppleScript's text item delimiters to "."
     set BookName to text item 2 of docRefA -- Set the bookmark name here! (1)
     set AppleScript's text item delimiters to ASTID
     --
     set PageCount to count of pages of document docRefA
     if PageCount = 1 then
          tell document 1
               make new bookmark at beginning with properties ¬
                    {destination page number:{PageCount}, fit type:fit page, name:BookName}
          end tell
     end if
     if PageCount > 1 then
          tell document 1
               make new bookmark at beginning with properties ¬
                    {fit type:fit page, name:BookName}
          end tell
     end if
end tell
-- Loop through the rest of the PDFs with invisible
repeat with i from 2 to thePDFs
     tell application "Finder"
          set theFile to item i of filesList as alias
     end tell
     tell application "Adobe Acrobat 7.0 Professional"
          activate
          open theFile with invisible
          set docRefB to the name of document 2
          --
          set ASTID to AppleScript's text item delimiters
          set AppleScript's text item delimiters to "."
          set BookName to text item 2 of docRefB -- Set the bookmark name here! (2)
          set AppleScript's text item delimiters to ASTID
          --
          set AddPages to count of pages of document docRefB
          insert pages document docRefA after PageCount from document docRefB starting with 1 number of pages AddPages with insert bookmarks
          if AddPages = 1 then
               tell document 1
                    make new bookmark at end with properties ¬
                         {destination page number:{(PageCount + 1)}, fit type:fit page, name:BookName}
               end tell
          end if
          -- Keep the running total so the next PDF is appended after the pages just added
          set PageCount to PageCount + AddPages
          close document 2 saving no
     end tell
end repeat
--
tell application "Adobe Acrobat 7.0 Professional"
     create thumbs document 1
     -- Strip down the names of imported multi-page PDFs
     repeat with j from 1 to (count of bookmarks)
          set thisBookName to name of bookmark j as string
          --
          set ASTID to AppleScript's text item delimiters
          set AppleScript's text item delimiters to "."
          if (count of text items of thisBookName) > 1 then
               set BookName to text item 2 of thisBookName -- Set the bookmark name here! (3)
               set name of bookmark j to BookName
          end if
          -- Always restore the delimiters, even when the name has no full stop
          set AppleScript's text item delimiters to ASTID
          --
     end repeat
     save document 1 to file ((filePath & "Compiled_Pages.pdf") as string) with linearize
     -- Optimize here (manually) and resave or
     -- close document 1 saving no
end tell
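
For anyone who would rather do the naming step in Acrobat JavaScript, the number.name.pdf rule from the comments at the top of the script can be sketched as a small function. This is only an illustration: bookmarkNameFromFileName is a made-up helper, not part of the script above, and the fallback for file names with fewer than two full stops is my own assumption.

```javascript
// Hypothetical helper (not from the original script): derive a bookmark name
// from a file name such as "002.Zebras Page.pdf".
function bookmarkNameFromFileName(fileName) {
  var parts = fileName.split(".");
  // number.name.pdf (or more full stops): the name is the second item,
  // which is what "text item 2" picks out in the AppleScript.
  if (parts.length >= 3) {
    return parts[1];
  }
  // name.pdf, or no full stop at all: fall back to the first item.
  return parts[0];
}

bookmarkNameFromFileName("002.Zebras Page.pdf"); // "Zebras Page"
bookmarkNameFromFileName("Index.pdf");           // "Index"
```

The fallback matters because, for a file called Index.pdf, taking the second item would name the bookmark "pdf".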

Offline TecNik

Re: Searching for text in a pdf...
« Reply #6 on: May 21, 2007, 08:39:38 AM »
Hi Mark,

Thanks for your reply and for the code, very neat. It was a nice way of doing it.
Yeah, I've used the file/Create PDF/From Multiple Files… method which certainly speeds things up.
I was just trying to put something together that can create bookmarks from certain text within the PDF.
I'll have another trawl through the scripting guide.

Thanks again,

Nick