Parse whole pdf-page to NSString on an iPhone

I’ve been trying to parse a PDF page of text into an NSString for a while now, and the only things I can find are methods that search for specific string values.

What I’d like to do is parse a single PDF page without using any external libraries such as PDFKitten, PDFKit, etc.

I’d like to have the data in an NSArray, NSString or NSDictionary if possible.

Thanks :D!

Here is part of what I’ve tried so far:

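    // Opens the PDF at `filename`. Returns NULL if it cannot be opened or has no pages.
    // The caller owns the returned document and must call CGPDFDocumentRelease on it.
    CGPDFDocumentRef MyGetPDFDocumentRef (const char *filename) {
        if (filename == NULL) {
            return NULL;
        }
        CFStringRef path = CFStringCreateWithCString (NULL, filename, kCFStringEncodingUTF8);
        if (path == NULL) {
            return NULL;
        }
        // The last argument is isDirectory: the path names a file, not a directory.
        CFURLRef url = CFURLCreateWithFileSystemPath (NULL, path, kCFURLPOSIXPathStyle, false);
        CFRelease (path);
        if (url == NULL) {
            return NULL;
        }
        CGPDFDocumentRef document = CGPDFDocumentCreateWithURL (url);
        CFRelease (url);
        if (document == NULL) {
            printf("could not open `%s'\n", filename);
            return NULL;
        }
        if (CGPDFDocumentGetNumberOfPages (document) == 0) {
            printf("`%s' needs at least one page!\n", filename);
            CGPDFDocumentRelease (document);  // do not leak the empty document
            return NULL;
        }
        return document;
    }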
    
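    // Operator-table callbacks: the scanner calls one of these each time it meets
    // the matching marked-content operator in the page's content stream.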
    static void op_MP (CGPDFScannerRef s, void *info) {
        const char *name;
        if (!CGPDFScannerPopName(s, &name))
            return;
        printf("MP /%s\n", name);
    }
    
    static void op_DP (CGPDFScannerRef s, void *info) {
        const char *name;
        if (!CGPDFScannerPopName(s, &name))
            return;
        printf("DP /%s\n", name);
    }
    
    static void op_BMC (CGPDFScannerRef s, void *info) {
        const char *name;
        if (!CGPDFScannerPopName(s, &name))
            return;
        printf("BMC /%s\n", name);
    }
    
    static void op_BDC (CGPDFScannerRef s, void *info) {
        const char *name;
        if (!CGPDFScannerPopName(s, &name))
            return;
        printf("BDC /%s\n", name);
    }
    
    static void op_EMC (CGPDFScannerRef s, void *info) {
        // EMC takes no operands in the PDF content-stream syntax, so there is
        // no name to pop; popping one would fail and nothing would be printed.
        printf("EMC\n");
    }
    
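    void MyDisplayPDFPage (CGContextRef myContext, size_t pageNumber, const char *filename) {
        // myContext is unused: this function only scans the page, it does not draw it.
        CGPDFDocumentRef document = MyGetPDFDocumentRef (filename);
        if (document == NULL) {
            return;
        }
        // totalPages is not declared in this excerpt; it is assumed to be a global defined elsewhere.
        totalPages = CGPDFDocumentGetNumberOfPages (document);
        CGPDFPageRef page = CGPDFDocumentGetPage (document, pageNumber);  // page numbers start at 1
        if (page == NULL) {
            CGPDFDocumentRelease (document);
            return;
        }

        // Map each operator name to its callback. These are all marked-content
        // operators; page text is drawn by the text-showing operators
        // (Tj, TJ, ' and "), which have no callbacks registered here.
        CGPDFOperatorTableRef myTable = CGPDFOperatorTableCreate();
        CGPDFOperatorTableSetCallback (myTable, "MP", &op_MP);
        CGPDFOperatorTableSetCallback (myTable, "DP", &op_DP);
        CGPDFOperatorTableSetCallback (myTable, "BMC", &op_BMC);
        CGPDFOperatorTableSetCallback (myTable, "BDC", &op_BDC);
        CGPDFOperatorTableSetCallback (myTable, "EMC", &op_EMC);

        // Scan the page's content stream; the callbacks fire as operators are found.
        CGPDFContentStreamRef myContentStream = CGPDFContentStreamCreateWithPage (page);
        CGPDFScannerRef myScanner = CGPDFScannerCreate (myContentStream, myTable, NULL);
        CGPDFScannerScan (myScanner);

        // The page dictionary holds entries such as Contents and Resources, not the
        // page text, so a lookup of the key "Lorem" is not expected to succeed.
        CGPDFDictionaryRef d = CGPDFPageGetDictionary(page);
        CGPDFStringRef str;
        if (CGPDFDictionaryGetString(d, "Lorem", &str)) {
            CFStringRef s = CGPDFStringCopyTextString(str);
            if (s != NULL) {
                NSLog(@"%@ testing it", s);
                CFRelease(s);  // CFRelease(NULL) crashes, so release only a non-NULL string
            }
        }

        // Release everything this function created.
        CGPDFScannerRelease (myScanner);
        CGPDFContentStreamRelease (myContentStream);
        CGPDFOperatorTableRelease (myTable);
        CGPDFDocumentRelease (document);
    }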
    
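    - (void)viewDidLoad {
        [super viewDidLoad];

        NSString *path = [[NSBundle mainBundle] pathForResource:@"TestPage" ofType:@"pdf"];
        if (path != nil) {
            // No drawing context is current in viewDidLoad, so UIGraphicsGetCurrentContext()
            // returns NULL here; that is harmless because MyDisplayPDFPage does not use it.
            MyDisplayPDFPage(UIGraphicsGetCurrentContext(), 1, [path UTF8String]);
        }
    }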
    

Solutions collected from the Internet for “Parse whole pdf-page to NSString on an iPhone”

Quartz provides functions that let you inspect the PDF document structure and the content stream. Inspecting the document structure lets you read the entries in the document catalog and the contents associated with each entry. By recursively traversing the catalog, you can inspect the entire document.

A PDF content stream is just what its name suggests: a sequential stream of data such as ‘BT 12 /F71 Tf (draw this text) Tj . . .’, where PDF operators and their descriptors are mixed with the actual PDF content. Inspecting the content stream requires that you access it sequentially.
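
To make the sequential access concrete, here is a minimal, portable C sketch (plain C, not Quartz code) that walks a content-stream string once from start to end and collects the literal string in front of each Tj (show text) operator. The function name extract_tj_text, its buffer sizes and its simplifications are assumptions made for illustration: it does not decode escapes such as \n or octal codes, does not apply font encodings, and ignores TJ arrays. With Quartz, the equivalent step would presumably be to register callbacks for Tj and TJ with CGPDFOperatorTableSetCallback and read their operands with CGPDFScannerPopString or CGPDFScannerPopArray.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends the text of every "(...) Tj" found in `stream`
 * to `out` (capacity `cap`), one line per Tj. Returns the number of Tj
 * operators that had a string operand. */
size_t extract_tj_text(const char *stream, char *out, size_t cap)
{
    char operand[256] = "";  /* most recent literal string (truncated if longer) */
    int have_operand = 0;
    size_t used = 0, shown = 0;
    const char *p = stream;

    if (cap == 0)
        return 0;
    out[0] = '\0';

    while (*p != '\0') {
        if (isspace((unsigned char)*p)) {
            p++;
            continue;
        }

        if (*p == '(') {
            /* Literal string: runs to the balancing ')'. */
            size_t n = 0;
            int depth = 1;
            for (p++; *p != '\0'; p++) {
                if (*p == '\\' && p[1] != '\0') {
                    p++;             /* simplification: keep the escaped char as-is */
                } else if (*p == '(') {
                    depth++;
                } else if (*p == ')' && --depth == 0) {
                    p++;             /* consume the closing ')' */
                    break;
                }
                if (n + 1 < sizeof operand)
                    operand[n++] = *p;
            }
            operand[n] = '\0';
            have_operand = 1;
            continue;
        }

        /* Any other token (number, /Name, operator) ends at whitespace or '('. */
        const char *start = p;
        while (*p != '\0' && !isspace((unsigned char)*p) && *p != '(')
            p++;
        size_t len = (size_t)(p - start);

        if (len == 2 && strncmp(start, "Tj", 2) == 0 && have_operand) {
            size_t text_len = strlen(operand);
            if (used + text_len + 1 < cap) {   /* room for text, '\n' and '\0' */
                memcpy(out + used, operand, text_len);
                used += text_len;
                out[used++] = '\n';
                out[used] = '\0';
            }
            shown++;
        }
        /* Tj takes exactly one operand, so only a string directly before it counts. */
        have_operand = 0;
    }
    return shown;
}
```

Called on the sample stream `BT 12 /F71 Tf (draw this text) Tj ET` with a 64-byte buffer, the function returns 1 and fills the buffer with the single line `draw this text`. Real page content is usually compressed and font-encoded, which is why a scanner that decodes the stream for you is the practical route on iOS.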

Apple’s developer documentation on PDF document parsing with Quartz 2D shows how to examine the structure of a PDF document and parse its contents.