Is it possible to Extract Text from the specific area exactly？

September 2, 2018, 11:46 pm

I have been trying to use the Adobe Acrobat SDK to extract text for many days.

I can get the whole text of a page, but the text is basically unordered.

Content normally comes back in an order such as top content, footer content, then main content.

We don't get the text in the order in which we see the words on the page.

Another reason I am asking is that we have used third-party tools such as PDFBox.

With that tool, given a specific area, the text is returned successfully. Unfortunately, it does not read PDF files successfully.

The Adobe Acrobat SDK, on the other hand, reads all PDF files well.

This is plainly what I want to do:

Given a specific area, return the text in it, just as when we read a PDF file, select some text, and copy it.

First: is it possible to do that?

Second: I used pdfDoc.CreateTextSelect(pageNumber, pdfRect);

This function does not return the text I want when the text is in a form or an image.

I passed a smaller pdfRect to CreateTextSelect, but the selection it returned had its own, larger BoundingRect.

The function also returns text like: DQM0 L VDDQDQ8VSSVDDDQ7VSSQ M VSSQDQ10DQ9DQ6DQ5VDDQ N VSSQDQ12DQ14DQ1DQ3VDDQ P ...

Correct me if I am wrong. Is it possible to do that, or am I using the wrong method?
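
One way to approach the ordering problem, as a rough sketch rather than a confirmed SDK recipe: collect each word together with its bounding box, keep only the words whose centre falls inside the target rectangle, and then sort them top to bottom and left to right. The sketch assumes the word boxes have already been obtained somehow (for example through the Acrobat JavaScript methods getPageNthWord and getPageNthWordQuads reached via GetJSObject, which is an assumption about the setup). The Word tuple layout, words_in_rect, and line_tol are hypothetical names used only for illustration. Text that exists only as an image will not show up this way without OCR.

```python
from typing import List, Tuple

# Hypothetical layout: (text, left, bottom, right, top) in PDF user space,
# with the origin at the bottom-left corner of the page.
Word = Tuple[str, float, float, float, float]
Rect = Tuple[float, float, float, float]  # (left, bottom, right, top)


def words_in_rect(words: List[Word], rect: Rect, line_tol: float = 2.0) -> List[str]:
    """Return the text lines inside rect, in reading order."""
    left, bottom, right, top = rect

    # Keep only words whose centre point lies inside the rectangle.
    inside = []
    for word in words:
        _, x0, y0, x1, y1 = word
        cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
        if left <= cx <= right and bottom <= cy <= top:
            inside.append(word)

    # Sort top to bottom (a larger y is higher on the page).
    inside.sort(key=lambda w: -w[2])

    # Group words whose bottoms are within line_tol of the line's first word.
    lines: List[List[Word]] = []
    for word in inside:
        if lines and abs(lines[-1][0][2] - word[2]) <= line_tol:
            lines[-1].append(word)
        else:
            lines.append([word])

    # Within each line, sort left to right and join with spaces.
    return [" ".join(t for t, *_ in sorted(line, key=lambda w: w[1])) for line in lines]
```

Filtering on the centre point rather than on any overlap is a design choice: it avoids pulling in words that merely touch the edge of the rectangle, which seems to be what makes the returned BoundingRect grow.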

↧

detecting lines and remove them

September 2, 2018, 9:37 am

Hello,

I have an InDesign plugin that draws lines over text when the page is exported to PDF. These lines indicate modifications to the document (track changes). I would like to be able to remove them, make them non-printing, or set their opacity to invisible from within Acrobat. We could go back to InDesign and export the pages again with the lines / marks hidden, but there is a possibility that something in the document has changed in the meantime. We are not able to put these lines / marks on a separate layer.

Is this possible in Acrobat?

Is the SDK C++ plugin the best route?

Or is it possible with JavaScript?

Thank you.

↧

Fix position of form fields containing page numbers

September 3, 2018, 6:18 am

Dear community,

I have an Excel VBA macro which adds page numbers to certain pdf documents.

Here is an example of what this generally looks like:

	' Start Acrobat and create a PDDoc object for the target file
	Set AcroApp = CreateObject("AcroExch.App")
	Set KurzGesamt = CreateObject("AcroExch.PDDoc")

	KurzGesamt.Open (strPfadVerteilungEndlauf & strNameKurzGesamt)
	Set jso = KurzGesamt.GetJSObject
	intSeiten = KurzGesamt.GetNumPages

	' Add a text field holding the page number to every page from page 2 on
	' (the JavaScript page index is zero-based, hence i - 1)
	For i = 2 To intSeiten
		Set objTextfeld = jso.AddField("Textfeld" & i, "text", i - 1, Array(810, 15, 830, 25))
		objTextfeld.Value = Str(i)
		objTextfeld.textSize = 10
		objTextfeld.textFont = "Calibri"
	Next i

	' Flatten the fields into the page content, then save and close
	jso.FlattenPages

	Call KurzGesamt.Save(1, strPfadVerteilungEndlauf & strNameKurzGesamt)
	KurzGesamt.Close

This code has worked perfectly for the last months but now, suddenly, I have the following problem:

Before, all form fields (containing the page numbers) were at the bottom on the right side of each page. (I got the "Array" values from above by trial and error.)

Now, the position is only correct on empty pages of the documents, while on the others, the page numbers are smaller and rather at the center of the page than on the right side.

It seems like font and position are adjusting to the respective page.

However, in the original pdfs nothing has changed and before, using the same code, this was not the case.

Therefore, my questions are:

1. Is there any setting in Acrobat 7.0 that I have to change in order to place the form fields independently of the remaining text, and

2. if so, how can I define this as standard setting or change it via VBA?

3. If the problem has nothing to do with Acrobat settings, how must I change my VBA code to get the correct form field position and font size?

By the way, I've already tried uninstalling and reinstalling Acrobat 7.0, because I thought I might have changed a setting accidentally, but this did not solve the problem.

Thank you in advance for any help!
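
I can't tell from the post why the position changed, but one way to depend less on trial-and-error coordinates is to derive the field rectangle from each page's own box instead of hard-coding it. The sketch below is written in Python purely for illustration (the arithmetic carries over to VBA). It assumes the page box is available as [upper-left x, upper-left y, lower-right x, lower-right y], which is how I understand the Acrobat JavaScript getPageBox method to report it; that, the function name bottom_right_rect, and the default margins are all assumptions. The defaults were chosen so that an A4 landscape page (842 x 595 points, also an assumption about the documents) reproduces the corners of the original Array(810, 15, 830, 25).

```python
from typing import List, Sequence


def bottom_right_rect(
    page_box: Sequence[float],
    width: float = 20.0,
    height: float = 10.0,
    margin_right: float = 12.0,
    margin_bottom: float = 15.0,
) -> List[float]:
    """Return [ulx, uly, lrx, lry] for a field in the bottom-right page corner.

    page_box is [ulx, uly, lrx, lry] in points (assumed layout).
    """
    _, _, page_right, page_bottom = page_box
    right = page_right - margin_right      # right edge of the field
    bottom = page_bottom + margin_bottom   # bottom edge of the field
    return [right - width, bottom + height, right, bottom]


# Assumed A4 landscape page: same corners as the hard-coded array in the post.
print(bottom_right_rect([0, 595, 842, 0]))
# Assumed A4 portrait page: the field follows the narrower page.
print(bottom_right_rect([0, 842, 595, 0]))
```

If the affected pages turn out to have a different size, crop box, or rotation than the empty ones, comparing their page boxes would be a cheap first check.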

↧

How to do for outCertListCab store all chain certificate?

September 3, 2018, 10:21 am

I would like to know how to make outCertListCab of PSSigSigPropParams store the whole certificate chain, not only the first key, so that the dialog box shows all certificates installed on the system.

↧

PDPageAcquirePDEContent Out of memory

November 18, 2011, 7:37 pm

Hi,

I have a one-page PDF document that is 1.5 GB in size. When I use PDPageAcquirePDEContent() on it, it throws an out-of-memory exception. I also see memory usage increasing dramatically in Task Manager. However, Acrobat opens this PDF fine.

The spec says that this API caches the data. Any workarounds or ideas are greatly appreciated.

Thanks

Rajeev

↧

Macintosh SDK samples build

September 4, 2018, 4:59 am

Hello,

I am having some trouble building the sample plugins for Macintosh.

Looking at the BasicPlugin, there only seems to be a debug target. Where should I install the built debug plugin? In the Environment.xcconfig file there is an entry:

ACROBAT_PLUGINS_FOLDER = /Applications/Adobe Acrobat Next Pro/Adobe Acrobat Pro.app/Contents/Plug-ins

This must be for a previous version of Acrobat?

↧

Fillable Forms and VBA - Seeking Alternative to Acrobat Object

September 6, 2018, 8:54 am

I am currently using Adobe Acrobat Pro 2017 and have created several fillable forms which our dealers use for various purposes. Once the forms come back to our Order Entry or Customer Service departments, I would like those departments to be able to pull the data from the forms into an Excel spreadsheet or Access database. I have been successful in using the Acrobat object in VBA to pull the data from each of the form fields and insert it into a spreadsheet or database as needed. However, most of my users (20+) have only Adobe Reader, and the Acrobat object is not installed or accessible with Reader. I am looking for other options to pull form field data and populate a Microsoft Office spreadsheet or database. I have not found an object like the Acrobat one that my users with Adobe Reader would have access to. Is there one that I am missing? Should I be using something other than VBA to access the fillable form fields?

Thank you so much. I look forward to some guidance as I've spent many hours trying to find a solution for this.

Debbie Poirier
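
One possible route that does not need the Acrobat object, sketched under assumptions: if the form data can reach you as an XFDF file (for example from a submit button configured to send XFDF; whether Reader users can export it themselves depends on the version and the document's rights), it is plain XML and can be read with any XML parser. The sketch below uses Python's standard library. read_xfdf_fields and SAMPLE_XFDF are hypothetical names, and the sample data is made up for illustration; the same parsing could be done from VBA with an XML library.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# XFDF elements live in this XML namespace.
XFDF_NS = {"x": "http://ns.adobe.com/xfdf/"}

# Made-up sample; a real file would come from the submitted or exported form.
SAMPLE_XFDF = """<xfdf xmlns="http://ns.adobe.com/xfdf/">
  <fields>
    <field name="Dealer"><value>Acme</value></field>
    <field name="Order">
      <field name="Qty"><value>3</value></field>
    </field>
  </fields>
</xfdf>"""


def read_xfdf_fields(xfdf_text: str) -> Dict[str, str]:
    """Return a {full field name: value} mapping from XFDF text."""
    values: Dict[str, str] = {}

    def walk(node: ET.Element, prefix: str) -> None:
        # Nested <field> elements form dotted names such as "Order.Qty".
        for field in node.findall("x:field", XFDF_NS):
            name = prefix + field.get("name", "")
            value = field.find("x:value", XFDF_NS)
            if value is not None:
                values[name] = value.text or ""
            walk(field, name + ".")

    fields = ET.fromstring(xfdf_text).find("x:fields", XFDF_NS)
    if fields is not None:
        walk(fields, "")
    return values


print(read_xfdf_fields(SAMPLE_XFDF))
```

The resulting dictionary can then be written to a spreadsheet row or database record with whatever tool is already in use.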

↧

How to suppress warning message when converting Pdf to Xml using Acrobat SDK InterApplication communication

September 9, 2018, 11:41 pm

I have a background service written in C# that converts PDF to XML using the Acrobat SDK interapplication communication approach.

The problem is that, for some PDFs, the Acrobat application shows a warning message window before saving the converted XML.

In that scenario the user needs to intervene and close the window manually. As it is a background service, I need to handle this warning window programmatically in code, but I did not find any way to do so in the Acrobat SDK documentation.

Below is a sample screenshot of the warning message shown while converting PDF to XML.

↧

modify the UI

September 10, 2018, 6:17 am

Hello,

With the C SDK is it possible to modify the UI? I would like to have a floating palette with buttons and text areas.

↧

NAMED DESTINATION

September 11, 2018, 3:38 am

How do I access a named destination that contains spaces?

For example, my named destination in the PDF file is 'About This'.

The following does not work:

<path of pdf>#nameddest=About This
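
A literal space is not a valid character in a URL, so one thing to try is percent-encoding the destination name before appending it. The sketch below only builds the string; whether a given Acrobat or Reader version resolves the encoded name is something I have not verified, and renaming the destination without spaces remains the fallback. nameddest_url and the example address are hypothetical.

```python
from urllib.parse import quote


def nameddest_url(pdf_url: str, destination: str) -> str:
    """Append a percent-encoded nameddest open parameter to a PDF URL."""
    # safe="" makes quote() encode every reserved character, including "/".
    return f"{pdf_url}#nameddest={quote(destination, safe='')}"


# Hypothetical address, for illustration only.
print(nameddest_url("http://example.com/doc.pdf", "About This"))
```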

↧

Custom URL format handlers (specifically iwl for HP Worksite)

September 12, 2018, 3:54 am

Hi,

Similar to the post "Custom URL links in PDF don’t work", we have noticed that the Worksite link protocol iwl: is blocked in Reader DC (Windows 10 64-bit).

However, enabling the protocol in HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Adobe\Acrobat Reader\DC\FeatureLockDown\cDefaultLaunchURLPerms, with "|iwl:2" added to the key, merely launches the default web browser, not the URL handler specified for the iwl protocol in HKEY_CLASSES_ROOT.

Is there a way to get the correct handler to be used in Reader?

thanks

↧

Need help with adding Bounding box / Trim box to multiple PDF files

September 13, 2018, 2:55 pm

We have PDF files that are generated using PHP to send to a printing company. The printing company wants us to add bounding boxes to these PDFs, but with the technologies we're using there's no way to add them without having the box visible on the printed paper. Also, since the number of PDFs generated is dynamic, it would take a lot of work for them to add the bounding boxes manually.

I'm new to the Adobe Acrobat SDK. Does anyone know if there's a way to achieve the following tasks:
1. Loop through each existing PDF

2. Add / Identify the Bounding Box

3. Save file with newly added Bounding Box

4. Move to the next PDF

The PDFs each have 2 pages. Each page is 336.35 mm x 226.35 mm (with 3.175 mm bleeds on each side, 6.35 mm in total), and the finished size is 330 mm x 220 mm. Also, assuming the bounding box is successfully added, will it be visible on the printed paper?

Thank you.
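
On the arithmetic only: a trim box is an entry in the page dictionary rather than drawn content, so by itself it should not appear on paper. Its coordinates are given in points (1/72 inch), so the millimetre figures above need converting. The sketch below works that out for the sizes in the post. trim_box_pt and MM_TO_PT are hypothetical names, and the result is in PDF order [left, bottom, right, top]; a tool such as the Acrobat JavaScript setPageBoxes method may expect a different corner order, which is worth checking before applying it.

```python
from typing import List

MM_TO_PT = 72.0 / 25.4  # points per millimetre


def trim_box_pt(page_w_mm: float, page_h_mm: float, bleed_mm: float) -> List[float]:
    """Return the trim box [left, bottom, right, top] in points.

    The trim box is the full page inset by the bleed on every side.
    """
    width = page_w_mm * MM_TO_PT
    height = page_h_mm * MM_TO_PT
    bleed = bleed_mm * MM_TO_PT
    return [round(v, 3) for v in (bleed, bleed, width - bleed, height - bleed)]


# Sizes from the post: 336.35 mm x 226.35 mm page with 3.175 mm bleed per side.
box = trim_box_pt(336.35, 226.35, 3.175)
print(box)
# The trim width and height should come back to the finished size in mm.
print(round((box[2] - box[0]) / MM_TO_PT, 2), round((box[3] - box[1]) / MM_TO_PT, 2))
```

The 3.175 mm bleed is exactly 1/8 inch, i.e. 9 points, which makes the result easy to sanity-check by eye.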

↧

ReferenceError: OptionsOnSec is not defined

September 14, 2018, 11:04 am

Dear All,

I use Python to call Acrobat JavaScript to convert PDF to Excel. The code looks like the following:

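        # Assumed import; the original snippet omitted it
        import win32com.client as win32

        aApp = win32.Dispatch('AcroExch.App')
        pdDoc = win32.Dispatch('AcroExch.PDDoc')
        # r, e and des are defined elsewhere (presumably source folder,
        # PDF file name and destination folder)
        if pdDoc.Open(r + "\\" + e):
            jsObject = pdDoc.GetJSObject()
            # Convert by saving through the JavaScript object as XLSX
            jsObject.SaveAs(des + "\\" + e + ".xlsx", "com.adobe.acrobat.xlsx")
            jsObject = None
            pdDoc.Close()
        pdDoc = None
        aApp.Exit()
        aApp = None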

I have converted 6000+ PDFs successfully without any problem.

But recently, when I reused this code to convert another 4000+ PDFs, about 600 of them failed with the error "ReferenceError: OptionsOnSec is not defined". The other PDFs were converted successfully without any problem.

I searched Google for the keyword "OptionsOnSec" but only found "2 results (0.38 seconds)", and neither of them was helpful. So I thought this forum might be able to help.

Could anyone provide me with any advice?

Thank you so much for your help!

↧

Handle unsupported digest algorithm in Adobe Reader plugin

September 16, 2018, 11:23 pm

Hi there.

I am creating a plugin for signature validation. The problem is that the signature uses a third-party digest algorithm, "GOST34311". I think Adobe simply does not recognize this algorithm. I tried to catch the exception in this function:

void DSEngine::sigValidate( PSSigValidateParams params );

but execution doesn't even get there; the exception is thrown earlier.

So, how can I get the algorithm name and handle a custom digest algorithm? Please suggest an approach.

Thanks in advance!

↧

AcroPDF in Windows 10

September 18, 2018, 8:28 am

I am developing a new application that uses the AcroPDF.dll ActiveX control. On Windows 7, on my development machine, the control works wonderfully. However, when I deploy to Windows 10, the control is just blank. There are no errors or anything; it is just blank. Adobe Reader is installed on both computers.

Any assistance is appreciated.

Thanks,

Robert

↧

Auto detect input fields in a flat PDF via the SDK. Is this possible?

September 19, 2018, 9:13 am

I have a flat PDF (no AcroForm fields). Is it possible, via the API, to get a list of possible input fields (text fields, check boxes, etc.)? This is what I want, only delivered via an API call (see link): Automatic Field Detection in Authoring

If the Adobe SDK does not do this, does anyone know of any API that does something like this?

↧

Is there an paid API from adobe for converting PDF to docx? We would like to subscribe, if any.

September 19, 2018, 10:11 pm

We would like to convert PDF to DOCX and PDF to HTML using Adobe Acrobat via an API.

Is there any API support for the same?

We would like to subscribe and integrate, if any.

↧

General information about Acrobat development

September 20, 2018, 9:10 pm

Dear Community,

I am searching for a method to automate a working process with PDF files. The automation should be able to open a batch of PDFs from a folder, split them into single files, automatically search each one for a string and a date, combine the corresponding PDFs, and "save as" the resulting PDF files in another folder.

Is there any way to do so without the need of a key?

Do I need the SDK for that?

Can I simply script it in C++ as a kind of macro, like in MS Office?

Do I have to have it certified by Adobe?

Do I need to be certified to create such a routine?

What are my options in this case?

Thank you very much and regards

Mathieu

↧

Looking for Interapplication Communication API Reference downloadable PDF for Acrobat X and/or XI

July 6, 2018, 4:34 pm

Is there a downloadable PDF file for "Interapplication Communication API Reference" for version X or XI? I already have this document for version 8.0 of the SDK.

↧

How do I send suggestions to Adobe?

September 21, 2018, 10:52 am

I have some ideas for improving Adobe Acrobat and Adobe Acrobat Reader. How do I contact Adobe with these ideas?

↧